Preface


This text has been written for use in a one-semester precalculus course for 
students who have had a good preparation in high school mathematics. Most 
high schools now give courses which cover all of the topics needed to begin 
calculus, and many high schools even offer an introduction to the calculus. 
However, only the very best students obtain from their high school courses the 
mathematical sophistication necessary to begin a college calculus course in which 
the emphasis is on the concepts as well as the techniques. The great majority 
of the students must be started in a calculus course which begins very slowly, 
or they must be given a precalculus course which helps develop their mathe- 
matical insight. The total amount of time required to reach the end of the basic 
calculus sequence is about the same in either case. However, the second method 
offers some extra advantages, since the students can at the same time learn 
something new which they may find useful. 

There is a basic problem in the design of any first-year college course in 
mathematics. This is the great variation in the mathematical training offered 
in the different high schools. Some students have had a complete set of modern 
courses, such as those represented by the SMSG sample textbooks. Others 
have had a few such courses. But many are still graduating from high school 
with three or three and a half units of strictly traditional material. 

No single course can cover this entire range. Each textbook must make some 
assumptions about what the student already knows and what type of training 
he has had. In this text, we assume that the student has some knowledge of 
trigonometry and elementary analytic geometry. Furthermore, we assume 
that the student has been exposed to at least one course with the modern point 
of view. To bridge the wide gap that still remains, the first two chapters of this
text offer a review of the material which the student
is assumed to have seen already.

The instructor must decide how rapidly the material in these first two chapters 
can be covered. It should be noted that this material is presented in too brief
a fashion to be suitable for the student who has never seen these topics before. 
To learn this material from the start would require an entire semester for most 
students. On the other hand, a very well-prepared student could skim over these 
two chapters in two or three weeks. The average amount of time which might 
be spent on these two chapters with a normal class would be about four weeks. 

It is recommended that this review material be included in the course, even 
for rather well-prepared students. There are several reasons for this. First, 
these chapters cover all of the topics needed to proceed with the rest of the 
book; they contain the definitions of our terminology and tell us what we can 
assume to be known in our proofs. The inclusion of this material makes the 
entire text (essentially) complete in its mathematical development. Secondly, 
the student who has taken mostly traditional courses will learn here some of the 
modern mathematical terminology and will be introduced to some aspects of
this point of view. Finally, if a student has had a number of the more modern 
courses, the contents of these chapters will offer him a fast review and should 
also show him how we can be precise when necessary and relax our precision 
when it is unimportant. In many cases, such a student will find that the de- 
velopment given here differs from what he has seen before. This may help 
him understand that there is no single “correct” way of developing math- 
ematics. 

Chapter 3 starts the “new” material in this text. This chapter introduces 
the mathematical study of vectors. As mentioned above, the main purpose of 
this text is to prepare students for the study of the calculus and other future 
courses. Many topics could be chosen which would be useful for this purpose. 
We have chosen to study vectors, because they are useful to engineers and 
physicists and because they can be used to prepare the student for the study of 
vector spaces, linear algebra, functions of several variables, and functional 
analysis. 

Because of this effort to build for the future, some of the material studied 
may seem rather strange. Obvious things are sometimes looked at from what may
appear to be an unnecessarily abstract point of view. When this happens, it is
done with good reason. An attempt is being made to introduce the student to 
the proper point of view in order to facilitate his future studies. On the other 
hand, there are also places where a more abstract point of view could have been 
taken, and yet was not. The emphasis throughout the chapters on vectors is 
on the geometric picture. Excessive abstraction cannot be allowed to interfere 
with geometric understanding. In fact, we make use of geometric intuition to 
motivate the abstract mathematical development. Many students will see here 
for the first time an abstract mathematical system developed to fit a specific 
intuitive picture. 

In Chapter 5, we discuss the conic sections. Many students will already be 
familiar with a more standard development of the conic sections. If so, we 
hope that they will find this approach new and interesting. The student who 
knows nothing about the conic sections will have to work hard here, since the 
discussion is rather rigorous and somewhat brief. While the vector methods 
developed in earlier chapters are used wherever convenient, we avoid using 
these for their own sake when other methods might be more efficient or more 
revealing. 

The last chapter discusses the rotation of coordinates and ends with two 
sections in which the quadric surfaces and polar coordinates are discussed in as 
concise a manner as possible. These topics are useful in a calculus course and 
are included for this reason. Unfortunately, there would not be time left in a 
one-semester course to expand on these topics beyond what is given here. 

The last comment may also help to explain why other material was not in- 
cluded. Each topic added to a one-semester course would require the deletion 
of something else. The topics included here were chosen to serve the basic 
purposes of the course in what seemed to be the best way. Other choices could 
also have been made, but it would require a great deal of revision to make more 
than very minor changes.

This text has been used for several semesters in a preliminary edition. The
results of these trials were quite satisfactory. The students found the material 
difficult, of course, but they acquired the desired knowledge and point of view. 
Both the students and the instructors seemed to find the text interesting. Their 
many helpful comments and suggestions were taken into account in preparing 
the final version of the text. 

I wish to express my appreciation to all of my colleagues for the many dis- 
cussions which helped to mold this text into its present form. In particular, I 
wish to thank Professor Stanley Jackson for his help in reading the manuscript 
in its several versions and for his many thoughtful suggestions. 


College Park, Maryland J.A.H. 
December 1964


The Real Number System


1-1 MATHEMATICAL REASONING 


In this book an attempt is made to introduce the student to the “mathe- 
matical way of thinking.” This is not to imply that there is some essential 
difference between this and any other (logical) way of thinking; the main 
point of mathematical reasoning is common to all disciplines: precision. 
There are, however, some special features, peculiar to mathematics, which 
require study. 

The student is expected, and required, to learn how to operate the mathematical
“tools” and (this is of equal importance) he must learn to judge
when these tools may (and when they may not) be applied. For these 
purposes, he must learn to distinguish between what is known and what 
is assumed, between the “actual world” and the “mathematical model” 
of it, and between a plausible argument and a proof. Few students will 
be required to be able to provide proofs for mathematical facts after 
leaving school, but many will find it essential that they be able to under- 
stand such proofs. In particular, they must know on what assumptions 
the particular results are based. 

As we study the various topics in this text, we will see how basic as- 
sumptions (axioms) are used to derive results. At this point, however, 
let us discuss something else which must be understood: mathematical 
definitions. 

In a dictionary a definition is supposed to explain the meaning and the 
use of a word. Of course, the definitions of one word are given in terms of 
other words. For example, we might find that the word cord would have 
“a string or small rope” as one of its meanings. This means that wherever 
applicable this phrase could be substituted in the place of the word “cord” 
without changing the meaning of the sentence in question. Suppose 
however, that one knew no English at all. What good would the English 
dictionary be? Imagine that you had a Russian dictionary with both 
the words and the definitions in Russian, and that you did not know any 
Russian. Every word of the language is there, and each word is defined. 
Yet this dictionary would be of no use to you if you wished to read something
written in Russian.

In compiling a dictionary, the editors must assume that the users already
know something about the language. If a definition is given for
every word, then some circular definitions must be present. For example, 
in the same dictionary which defined a cord as “a string or small rope” 
we find a string defined as “a small cord or slender strip of leather,” and a 
rope defined as “a large stout cord made of strands of fiber or wire twisted 
or braided together.” 

One way of explaining the mathematical point of view is to describe 
the way in which a mathematician would write a dictionary. First, there 
would be given a list of “primitive words” which would be left undefined. 
Next, there would be a list of words whose definitions used only the primi- 
tive words. Finally, there would be the remaining words so arranged that 
the definition of each made use only of primitive words or words which 
had been already defined. 
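
The ordering just described can be tested mechanically. The following is a minimal sketch in Python (3.9 or later, for the standard `graphlib` module), not part of the original text. The function name `definition_order` and the small "geometric" vocabulary are hypothetical, and the cord, string, and rope entries are reduced to the nouns on which each definition depends.

```python
from graphlib import CycleError, TopologicalSorter


def definition_order(definitions, primitives):
    """Order the defined words so that each definition uses only primitive
    words or words defined earlier. Return None if no such order exists."""
    # Keep only the non-primitive words on which each definition depends.
    graph = {
        word: {used for used in words_used if used not in primitives}
        for word, words_used in definitions.items()
    }
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        # Some chain of definitions leads back to the word it started from.
        return None


# The circular trio quoted above, reduced to the nouns that matter.
everyday = {"cord": {"string", "rope"}, "string": {"cord"}, "rope": {"cord"}}

# A hypothetical vocabulary built on two undefined (primitive) words.
geometric = {"segment": {"point", "line"}, "triangle": {"segment", "point"}}

print(definition_order(everyday, primitives=set()))              # None
print(definition_order(geometric, primitives={"point", "line"}))
```

With no primitive words, the everyday dictionary admits no acceptable ordering. Once "point" and "line" are left undefined, the second vocabulary can be arranged with "segment" before "triangle."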

A mathematical definition differs from a dictionary definition in another 
way. A good dictionary will attempt to explain the meaning and the use 
of a word and to distinguish it from other words with a similar, but not 
identical, meaning. A definition in mathematics will merely give an 
equivalent to the term being defined. Explanations of the meaning and 
the usage of the term must be given separately. This is true even for the 
terms which are taken as undefined. An essential characteristic of most 
works in mathematics is the precise use of terms. The exact meanings of 
these terms must be known. 

Let us give an example of what we mean by discussing the mathemat- 
ical use of the word set. We assume that the student has been introduced 
to this concept already, but even if he hasn’t, the basic idea is simple 
enough to grasp. We are treating the concept of a set as an undefined 
term, but this does not prevent us from explaining carefully what we 
mean when we use the term. 

A set is a collection of things which are called the elements of the set. 
In our use of this term, there are only two things which need to be kept 
in mind. First, an element either is in a set or it is not. If a set is specified
by listing its elements, an element which is listed more than once is still 
in the set only once. 

Secondly, the elements in a set have no order. Listing the elements of 
a set puts them in some order, but this order is not part of the structure 
of the set. In loose, but picturesque, terms this idea can be expressed by 
saying that a set is a “bag full of elements.”
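
These two properties are exactly the ones built into the set type of many programming languages. As an illustration only (the variable names are ours, not the text's), a short Python sketch:

```python
# The same three elements, listed in different orders and with repeats.
listed_once = {1, 2, 3}
listed_with_repeats = {3, 1, 2, 2, 1}

# Repetition and order of listing do not change the set.
print(listed_once == listed_with_repeats)  # True
print(len(listed_with_repeats))            # 3

# An element either is in the set or it is not.
print(2 in listed_once, 5 in listed_once)  # True False
```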

One of the sources of the power of mathematics is in the use of symbols. 
Thus elementary algebra has as its basic idea the introduction of letters 
to symbolize real numbers. In the same way, it is useful to introduce 
symbols, single letters, to represent sets. Capital italic letters are used 
for this purpose in this text. Unfortunately, the same letters are used for
other purposes also, but in every case the meaning of the symbol used 
should be clear from the context. 

If A is a set and x is an element of that set, we write


x ∈ A,


which can be read, “x is in A” or “x is an element of A.”

When a set is specified by listing all of its elements, it is customary to 
indicate the set itself by enclosing the list in braces. Thus the set whose 
elements are the three integers 1, 2, and 3 is written as 


{1, 2, 3}. 


When a set is specified by giving some rule which allows us to determine 
whether or not a given element is in the set, another notation is used. 
For example, 

{x | x is an integer, x > 1} 


is read: “The set of all x such that x is an integer and x > 1.” The braces 
indicate that we are defining a set. The x before the vertical bar is the 
symbol indicating the general form of the elements of the set we are 
talking about. It is a “dummy variable,” meaning that any other symbol
can be used in its place, just as we can use x, y, or any other letter to 
indicate some unknown number in an algebraic problem. The rule for 
determining whether or not an element is in the set is given after the 
vertical bar. It is to be noted that the vertical bar does not mean “such 
that”; it is merely a symbol which divides the “dummy variable” from 
the rule and, in this context, can be read as “such that.” 

Usually the type of element under consideration is understood and no 
explicit statement of its nature need be made inside the symbol for the 
set. That is, if it were understood that we were considering only the 
integers, the above set could be written {x | x > 1}. However, this will
usually be done only when the elements under consideration are the real 
numbers, or doubles, or triples of real numbers. In any case, it will always 
be evident which elements are to be considered as possible candidates for 
inclusion in the set. 
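
The rule notation has a close analogue in the "set comprehension" of Python. The sketch below is an illustration under one loud assumption: since a program cannot list all the integers, the finite range from -5 to 5 (the name `universe` is ours) stands in for the understood collection of candidates.

```python
# Assumption: a finite stand-in for "the integers under consideration".
universe = range(-5, 6)

# {x | x is an integer, x > 1}, restricted to the stand-in universe.
A = {x for x in universe if x > 1}

# The same set written with a different dummy variable.
B = {y for y in universe if y > 1}

print(A == B)     # True: the choice of dummy variable is immaterial
print(sorted(A))  # [2, 3, 4, 5]
```

The symbol before `for` plays the part of the dummy variable, and the condition after `if` plays the part of the rule following the vertical bar.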

If A and B are two sets such that every element of A is also an element
of B, then A is called a subset of B, and we write A ⊂ B. Thus, for example,
the set {x | 1 < x < 2} (x being allowed to be any real number)
is a subset of the set of all real numbers. It is also a subset of the set of
all positive numbers, {x | x > 0}.

There is one peculiar type of set which frequently causes confusion, the
empty set, which is also called the null set or void set. It is defined to be the 
set which contains no elements. The symbol commonly used for the empty 
set is Ø. The empty set occupies the same position in discussions of sets 
as the number zero occupies in the discussion of integers, and is often ob- 
jected to by the beginning student for the same reason that the number 
zero was objected to when it was first introduced. Since it represents 
nothing, why do we need a symbol for it? If the student can answer this 
question for the number zero, then he should be able to answer it also for 
the empty set. 

As the student goes through this text, in addition to learning the ma- 
terial, he should spend time considering why the definitions are given as 
they are, and why the proofs of theorems are given in the form that they 
are. The student is not expected to be very familiar with mathematical 
proofs when he starts this course. Therefore, when proofs are given in 
the text, comments about their logic will be made. It is hoped that by the 
end of the course, the student will know a good deal more about how 
theorems in mathematics can be proved. 


PROBLEMS 


1. Each of the following sets is to be a subset of the set of all real numbers. 
Some are equal; some are subsets of others. Find all the relations between 
the sets. (Note: equality is always used in the sense of identity.) 


A = {x | 0 < x < 1}          B = {t| <4
C = {y | y² < 1}             D = {z | z² < 1 and z > 0}
E = {w | −2 < w < 2}         F = {t | t > 0}


2. Write in set notation: 
(a) The set of all real numbers whose squares are less than 2 
(b) The set of all pairs, (x, y), of real numbers for which the first member of 
the pair is smaller than the second 
(c) The set of all even integers 
(d) The set of all integer multiples of 5 
(e) The set of all positive real numbers whose squares are less than 2 


3. Write each of the following sets of real numbers in a simpler form such as 
{x | a < x < b}.


(a) {x | x² − x < 0}         (b) {x | x² + x + 1 < 0}
(c) {x | x² + x − 2 < 0}     (d) {r| 1? + 384+ 42> 0}


4. According to the definition of a subset, is a set a subset of itself? Explain. 


5. According to the definition of a subset, if B is some set, is Ø ⊂ B? Explain.


6. Two sets A and B are equal, A = B, if and only if they consist of exactly
the same elements. To show that two sets are equal, you must show that 
every element in one is in the other, and vice versa. Identify each of the 
following statements as either true or false. Explain your statement. (Note 
that in order to disprove a statement, it is only necessary to give a single 
example in which the statement does not hold.) 


(a) If A ⊂ B and B ⊂ A, then A = B.
(b) If A ⊂ B, C ⊂ D, and A = C, then B = D.
(c) If A ⊂ B and C ⊂ B, then A = C.
(d) If A ⊂ B and B ⊂ C, then A ⊂ C.


7. (a) Let A₁ be a set containing exactly one element. How many subsets does
A₁ have?
(b) Let A₂ be a set containing exactly two elements. How many subsets does
A₂ have?
(c) Let A₃ be a set containing exactly three elements. How many subsets
does A₃ have?
(d) How many subsets are there of a set which has exactly n elements? 


8. Using a dictionary of “collegiate” size or larger, find a circle of definitions. 
That is, look up some noun. Choose a noun in the definition which is crucial 
to at least one part of the definition. Look up this noun and continue in this 
way until you arrive at a noun already in the chosen list. 


9. Using a good dictionary, try to find the exact difference between the following 
pairs of words. 


(a) enclose and inclose (b) further and farther 
(c) whence and whither (d) should and would 
(e) as and like (f) imply and infer 

(g) affect and effect (h) that and which


The student probably feels that he has a good working knowledge of
the real numbers. After all, he has been studying them for years. In this 
section, and the next few sections, we wish to review the basic properties 
of the real numbers. The student may recognize the particular properties 
which we single out as being the rules of elementary algebra, but he is 
warned that the attitude adopted toward them here is different. We dis- 
cuss these properties in terms of the axioms which define them; and the 
particular axioms we choose to discuss will be the important thing. 

No attempt will be made to define the real number system here. In- 
stead, we shall merely list the properties which distinguish the real number 
system from other more or less similar mathematical systems (such as the 
integers). As will be seen, the particular properties that will be discussed 
are fundamental. Some of them will appear again as properties of entirely
different systems. Some of them will prove to be false in other mathematical
systems, but all of them should be known by the student in order
to help him gain an understanding of what underlies the mathematics 
he is to learn. 

Let us start by making a precise statement of our assumptions about 
the real number system: 


The real number system is a set of elements on which two binary operations,
called addition and multiplication, and a binary relation, called order, are 
defined. 


We shall not define the terms binary operation or binary relation in 
general at this point, but only explain them in the given context. 

The binary operation of addition on the real numbers associates to every 
pair of real numbers, a and b, a unique real number c, called the sum of 
a and b. We write c = a + b to represent this sum. The binary operation
of multiplication on the real numbers similarly associates to every pair of 
real numbers, a and b, a unique real number d, called the product of a and b. 
We write d = ab, or d = a · b, to represent this product.

These statements explain what we mean by the binary operations of 
addition and multiplication. They do not, however, explain what the 
operations are, or how they behave. (The binary relation of order will be 
studied in the next section, so we shall not discuss it here.) 

In order to try to explain what addition and multiplication are, we list 
some of the properties of these operations. These properties are nothing 
more than the basic laws of algebraic manipulation and will be well 
known to any student who is familiar with elementary algebra. The par- 
ticular set of properties we choose to write down may appear to be rather 
brief from this point of view, but it so happens that one can prove the 
computational properties of the real numbers, assuming nothing more 
than these. These properties are therefore the axioms which determine 
the elementary properties of the real numbers. 

Before listing these axioms, we should say a word about equality of 
real numbers, or about equality in general. In this text, the equality sign, 
=, is used to mean actual identity of the elements separated by it. That 
is, we write a = b, if and only if a and b are actually the same. This can 
also be explained by saying that a and b are two names for the same thing. 
Actually, what we are doing is using the equality sign in its meaning as 
applied to the elements of a set (the set of real numbers in this case). 

The axioms we now introduce are called the field axioms, since they 
constitute the axioms for a special mathematical structure known as a 
field. The student will encounter the concept of fields again if he takes a 
course in abstract algebra.


The Field Axioms for the Real Numbers

1. The Commutative Law for Addition. For any real numbers a and b,

a + b = b + a.

2. The Commutative Law for Multiplication. For any real numbers a and b,

ab = ba.

3. The Associative Law for Addition. For any real numbers a, b, and c,

a + (b + c) = (a + b) + c.

4. The Associative Law for Multiplication. For any real numbers a, b, and c,

a(bc) = (ab)c.

5. The Existence of the Identity for Addition. There exists a real number 0
such that if a is any real number,

a + 0 = 0 + a = a.

6. The Existence of the Identity for Multiplication. There exists a real number
1 ≠ 0 such that if a is any real number,

a · 1 = 1 · a = a.

7. The Existence of Inverses for Addition. For any real number a, there exists
a corresponding real number −a such that

a + (−a) = 0.

8. The Existence of Inverses for Multiplication. For any real number a ≠ 0,
there exists a corresponding real number a⁻¹ such that

a · a⁻¹ = 1.

9. The Distributive Law. For any real numbers a, b, and c,

a(b + c) = ab + ac.


A few comments about these nine axioms are called for at this point. 

First, many mathematicians would add to this set two further axioms, 
the closure axioms. These state that if a and b are real numbers, then 
a + b and ab are also real numbers. The uniqueness of a + b and ab is 
usually included in these same axioms. We prefer to think of these properties
as being implied by the assumption that the binary operations of
addition and multiplication are defined on the set of real numbers, and
that the result of these binary operations is in every case a unique real 
number. These properties would become important if we wished to discuss 
the operations of addition and multiplication on subsets of the real num- 
bers or if we were to discuss similar operations in other mathematical con- 
texts. We also need to quote the closure and uniqueness properties as the 
reason for some steps in proofs (as we will see below). 


Definition 1-1. Let R be a given set and let a binary operation be defined 
on R; that is, for every pair of elements a, b in R, there exists a unique 
element c in R which we write as c = a ⊕ b. Let S be a subset of R.
Then we say that the set S is closed with respect to the operation ⊕ if
and only if for every a and b in S, a ⊕ b is also in S.


This formal definition should help make clear exactly what is meant 
by the closure property. We shall not give a formal definition of the 
uniqueness property, since we understand this to be an integral part of 
the concept of a binary operation. 
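
For a finite subset S, Definition 1-1 can be checked by simply trying every pair. The following Python sketch is an illustration only. The function name `is_closed` and the sample set {−1, 0, 1} are our own choices, and the method cannot, of course, be applied to an infinite set.

```python
import operator
from itertools import product


def is_closed(S, op):
    """Check Definition 1-1 for a finite set S: op(a, b) must lie in S
    for every pair (a, b) of elements of S."""
    return all(op(a, b) in S for a, b in product(S, repeat=2))


S = {-1, 0, 1}

print(is_closed(S, operator.mul))  # True: every product is -1, 0, or 1
print(is_closed(S, operator.add))  # False: 1 + 1 = 2 is not in S
```

Note that the same set may be closed with respect to one operation and not closed with respect to another. Closure is a property of the set and the operation together.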

The student should observe the names attached to the various axioms. 
The properties described by these axioms are fundamental and will be 
found over and over again in different mathematical contexts. If, therefore, 
the student does not already know these properties by name, he is advised 
to learn them; these names are an essential part of the language of mathe- 
matics. 

Finally, observe closely the wording of these axioms. The order in 
which the phrases occur is most important. Here the content is familiar, 
and it is easy to slide over the full significance of the various phrases. 

Look, for example, at the fifth axiom. It says that there exists a real
number zero and that this number exists once and forever, without any
regard to the real number a that it is being added to. In the seventh axiom,
however, the order of the phrases is reversed. This statement asserts the
existence of the negative of a number, once we are given the number. No
implication is made that there is a single, universal number −a. Look
at the various statements and observe the logic of their construction.
Try to see how the assertions made would be altered if the statements 
appeared in a different order. 
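
The effect of the order of the phrases can be seen by direct computation in a small finite system. The Python sketch below is an illustration under stated assumptions: it uses the integers 0 through 4 with addition "modulo 5," a system that does not appear in the text, because every element of a finite system can be examined in turn.

```python
n = 5
Z = range(n)  # assumption: the integers 0..4 with addition modulo 5


def add(a, b):
    return (a + b) % n


# Pattern of Axiom 5: there exists e such that, for every a, a + e = a.
exists_identity = any(all(add(a, e) == a for a in Z) for e in Z)

# Pattern of Axiom 7: for every a there exists b (depending on a)
# such that a + b = 0.
each_has_negative = all(any(add(a, b) == 0 for b in Z) for a in Z)

# Axiom 7 with the phrases reversed: there exists a single b such that,
# for every a, a + b = 0.
universal_negative = any(all(add(a, b) == 0 for a in Z) for b in Z)

print(exists_identity, each_has_negative, universal_negative)
# True True False
```

The second and third statements use the same equation and the same two phrases. Only their order differs, and one statement is true while the other is false.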

From these nine axioms all of the purely arithmetic properties of the 
real number system could be proved. For example, the obvious exten- 
sions (which could be formally proved) of the first four of these axioms 
permit us to write sums or products in any order without having to worry 
about introducing parentheses to specify which operations should be done 
first.

While we do not wish to spend much time on the development of the
algebraic properties of the real number system from these axioms at this 
point, two particular results which show how this development could 
proceed might be of interest. We will sketch their proofs. 


Theorem 1-1. If a and b are two real numbers, then there exists a unique
real number x such that


a + x = b.


Proof: We first note that the real number


(−a) + b,


whose existence is guaranteed by Axiom 7, does indeed satisfy the requirement
for x. This is proved by successive applications of Axioms 3, 7, and 5.
To see that the solution, x, is unique, suppose that a + x = b and
a + y = b. Then from the identity of the numbers involved, a + x =
a + y. We then merely need to add (−a) to this number in its two representations
to conclude (after using Axioms 3, 7, and 5) that x = y.


Theorem 1-2. If a is any real number, then


a · 0 = 0.


Proof: We have that

1 + 0 = 1,

and hence that

a(1 + 0) = a · 1.

But then, from the distributive law

a · 1 + a · 0 = a

or

a + a · 0 = a.

However, we know that a + 0 = a, and hence from Theorem 1-1 we
conclude that a · 0 = 0.

The proofs of the above theorems have been written in the typical 
informal style used in mathematical works these days. They could also 
have been given in a formal style, with each step being given a full justification.
For example, the formal proof of Theorem 1-2 would look like this:


STATEMENT                         REASON

(1) 1 + 0 = 1                     Existence of the identities for addition
                                  and multiplication
(2) a(1 + 0) = a · 1              Uniqueness of multiplication
(3) a · 1 = a                     Identity for multiplication
(4) a(1 + 0) = a                  Equality of numbers in (2) and (3)
(5) a(1 + 0) = a · 1 + a · 0      Distributive law
(6) a(1 + 0) = a + a · 0          Equality of numbers in (3) and (5)
(7) a + a · 0 = a                 Equality of numbers in (4) and (6)
(8) a + 0 = a                     Existence of identity for addition
(9) a · 0 = 0                     Theorem 1-1 applied to (7) and (8)


If the student writes out the complete formal proofs of a few theorems, 
he will see how we can say that these results can be derived from the 
axioms alone. However, these formal proofs are usually too long and too 
detailed for ordinary use. The fragmentation of the steps makes it 
difficult to see exactly what the main point of the proof is. Informal 
proofs need to give only enough detail to be convincing. Correctly done, 
the informal proof should give the reader enough information to enable 
him to write out the complete proof in a formal fashion if required. 

The proof of Theorem 1-1 contains two distinct parts. In the first 
part, it is shown that a solution of the given equation does indeed exist. 
This is done directly, by exhibiting a number whose existence is proved 
from the axioms and which satisfies the equation. The second part of the 
proof shows that this solution is unique. This is proved by showing that 
any two numbers which satisfy the equation must in fact be the same. 
Note that proving this part alone does not prove the entire theorem. It 
might well be possible to prove that any two solutions of a given equation 
would be equal when there were in fact no solutions at all. Another way 
of saying this is to observe that the first part of the proof of Theorem 1-1 
shows that there is at least one solution while the second part shows that 
there is at most one. 
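
The two halves of such a proof are independent of each other, as a small computation may suggest. The Python sketch below is an illustration only and proves nothing. The sample numbers 3/4 and −5/2, the helper `solutions`, and the search range from −10 to 10 are all our own assumptions, with exact rational arithmetic standing in for the real numbers.

```python
from fractions import Fraction

# Existence in Theorem 1-1: the number (-a) + b does satisfy a + x = b.
a, b = Fraction(3, 4), Fraction(-5, 2)
x = (-a) + b
print(a + x == b)  # True

# "At least one" and "at most one" are separate questions. Here we search
# the integers from -10 to 10 (an assumed, finite stand-in).
candidates = range(-10, 11)


def solutions(equation):
    """List every candidate integer that satisfies the given equation."""
    return [k for k in candidates if equation(k)]


none_found = solutions(lambda k: 2 * k == 1)  # at most one, yet none exists
two_found = solutions(lambda k: k * k == 4)   # at least one, but not unique
one_found = solutions(lambda k: 3 + k == 7)   # exactly one

print(none_found, two_found, one_found)  # [] [-2, 2] [4]
```

The equation 2k = 1 shows the situation described above: any two integer solutions would have to be equal, and yet there are no integer solutions at all.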


PROBLEMS 


1. Show that the set of all integers, with the usual operations of addition and 
multiplication, does not satisfy all of the field axioms. 


2. The rational numbers can be represented in the form p/q where p and q are
integers, with q ≠ 0. Two pairs of integers, r/s and p/q, represent the same
rational number if and only if rq = ps. Addition and multiplication of
rational numbers are defined by


r/s + p/q = (rq + ps)/(sq),
(r/s)(p/q) = (rp)/(sq).


The set of all rational numbers satisfies the field axioms. Prove that Axioms 
7, 8, and 9 are satisfied, assuming that the first six have already been proved. 


3. Show that the set consisting of the two elements 1 and 0 satisfies the field 
axioms if we suppose that 


0 + 0 = 0,              0 · 0 = 0,
0 + 1 = 1 + 0 = 1,      0 · 1 = 1 · 0 = 0,
1 + 1 = 0,              1 · 1 = 1.


4. Prove from the field axioms that if a · b = 0, then either a or b is zero.
[Hint: suppose one of them, say b, is not zero. Use Axiom 8 to prove that a 
must then be zero.] 


5. Let V be the set of all possible pairs of real numbers, (a, b). Two elements 
of V are equal if and only if both consist of the same pair of numbers in the 
same order. Define the sum and product in V by 


(a, b) + (c, d) = (a + c, b + d),
(a, b) · (c, d) = (ac, bd).


Which of the field axioms hold in V? Give an example showing the failure 
of any of the axioms which are not true. Give an example showing the 
failure of the property of Problem 4. 


6. With the same set V as Problem 5, define sums and products by 
(a, b) + (c, d) = (a + c, b + d),
(a, b) · (c, d) = (ac − bd, ad + bc).
Which of the field axioms hold in this case? 


7. Let m be any positive rational number which is not the square of a rational
number. Show that the set of all numbers of the form a + b√m, where a
and b are rational, satisfies the field axioms.


8. Write out a full formal proof of Theorem 1-1. 
9. Prove that if a · a = a, then either a = 0 or a = 1. 
10. Which of the field axioms are satisfied by the set of positive real numbers? 


11. Prove that for any a, −a = (−1)a. [Hint: 1 + (−1) = 0. Multiply both 
sides by a.] 


12. Prove that for any a, −(−a) = a. 


13. Each of the following binary operations is defined on the set of real numbers. 
Check each operation as to whether or not it is commutative or associative. 
If there is an identity for the operation, state what it is. If there is none, 
show why not. If an identity for the operation does exist, do inverses exist? 
(a) axb = (a + b) 
(b) a A b = the larger of a and b 
(c) a ∘ b = a + 2b 


14. For each of the following subsets of the reals, check whether the subset is 
closed under the three operations defined in Problem 13: 
(a) the set of all integers 
(b) the set of all positive reals 
(c) the set of all real numbers whose square is less than or equal to one 


1-3 THE ORDER AXIOMS 


The field axioms listed in the previous section fail to characterize the 
real numbers. The fact that the examples in Problems 2 and 3 of the last 
section satisfy the field axioms shows this to be true. If the student is 
familiar with complex numbers, he can check to see that the set of complex 
numbers also satisfies the field axioms (this was Problem 6 of the 
last section). 
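
Because the system of Problem 3 has only two elements, the claim that it satisfies the field axioms can be checked case by case. The following Python sketch is offered only as an illustration. The tables ADD and MUL and the function name are our own, and the axioms tested are the usual ones (commutativity, associativity, distributivity, identities, inverses), which may be grouped or numbered differently from the list in the last section.

```python
from itertools import product

ELEMS = (0, 1)

# Operation tables copied from Problem 3; note that 1 + 1 = 0.
ADD = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
MUL = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}


def satisfies_field_axioms():
    """Check the usual field axioms on the two-element system by brute force."""
    add = lambda x, y: ADD[(x, y)]
    mul = lambda x, y: MUL[(x, y)]
    for a, b, c in product(ELEMS, repeat=3):
        # Commutative laws.
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
            return False
        # Associative laws.
        if add(add(a, b), c) != add(a, add(b, c)):
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):
            return False
        # Distributive law.
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False
    for a in ELEMS:
        # 0 and 1 act as the additive and multiplicative identities.
        if add(a, 0) != a or mul(a, 1) != a:
            return False
        # Every element has a negative; every nonzero element a reciprocal.
        if not any(add(a, x) == 0 for x in ELEMS):
            return False
        if a != 0 and not any(mul(a, x) == 1 for x in ELEMS):
            return False
    return True
```

A finite check of this kind is possible only because the set is finite; for the rational or real numbers the axioms must be proved, not enumerated.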

A property of the real numbers that is not shared by the field of complex 
numbers, or one of the fields of the type seen in Problem 3 of the last sec- 
tion, is order. The order relation of the real numbers is linked to the 
field properties by certain axioms. 


The Order Axioms for the Real Numbers 


There exists an order relation on the real numbers. For every pair of real 
numbers a and b this relation is either true or false. If it is true, we say that 
a is less than b and write a < b. If it is not true, we write a ≮ b. This 
relation satisfies: 


1. The Trichotomy Law. For any pair of real numbers a and b, one and only 
one of the following holds: 


(a) a < b, (b) a = b, (c) b < a. 


2. The Transitive Law. If a, b, and c are real numbers such that a < b 
and b < c, then a < c. 


3. The Addition Law. If a, b, and c are real numbers and a < b, then 
a + c < b + c. 


4. The Multiplication Law. If a, b, and c are real numbers such that a < b 
and 0 < c, then ac < bc. 


From these axioms, all of the familiar properties of the order relation 
can be proved. For example: 


Theorem 1-3. If a < b and c < d, then a + c < b + d. 


Proof: Since a < b, from Axiom 3 we have 

a + c < b + c. 

Similarly, since c < d, we have 

b + c < b + d. 


The transitive law applied to these two inequalities then gives the desired 
conclusion, a + c < b + d. 

Other results can be proved just as easily. We will list a few of these 
without proof. The student should study them to be sure he knows and 
can apply the results. In the statement of these theorems, we will use a 
few symbols and conventions which have not been formally introduced, 
but this should cause no difficulty. The student knows, for example, 
that a > b means b < a, that a ≤ b means that either a < b or a = b, 
that a − b means a + (−b), and so forth. 

Similarly, we will assume that all of the computational consequences 
of the axioms of the last section have been proved. For example, we may 
assume that −a = (−1)a, that (−a)(−b) = ab, and so on. It is not 
our purpose to give a detailed development of all of the properties of the 
real number system. Rather, we are only interested in showing how this 
might be done and pointing out that all of these properties depend on a 
very few axioms. 


Theorem 1-4. If a > 0, then −a < 0. 

Theorem 1-5. If a < b, then b − a > 0. 

Theorem 1-6. If b − a > 0, then a < b. 

Theorem 1-7. If a < b, then −b < −a. 

Theorem 1-8. If b < 0, then −b > 0. 

Theorem 1-9. If a < b and c < 0, then ac > bc. 

Theorem 1-10. 1 > 0. 

Theorem 1-11. If 0 < a < b, then 0 < b⁻¹ < a⁻¹. 

There is an entirely different way of looking at the order properties of 
the real numbers, typified by Theorems 1-4, 1-5, and 1-6. This method 
involves consideration of those real numbers which are positive, and leads 
to an alternative set of axioms to characterize the order properties. 


Alternative Order Axioms 


There exists a set P of positive elements in the set of all real numbers such 
that: 


1. The Trichotomy Law. If a is any real number, then one and only one of 
the following is true: 


(a) a ∈ P, 

(b) a = 0, 

(c) −a ∈ P. 

2. The Addition Law. If a and b are in P, then (a + b) ∈ P. 

3. The Multiplication Law. If a and b are in P, then a · b ∈ P. 


These two sets of possible axioms are related by the assertion that 
a < b if and only if b − a is positive. Indeed, with this as the definition 
it is not difficult to prove these last three laws from the first set or con- 
versely to prove the first set from these three. As an example, we can 
prove 


Theorem 1-12. The transitive law is a consequence of the three alterna- 
tive order axioms. 


Proof: Suppose a < b and b < c. This means that (b − a) and (c − b) 
are positive. Then the second axiom tells us that 


(b − a) + (c − b) = c − a 


is positive, and hence a < c. 
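
The passage from the set P to the relation < can be imitated on a computer. The sketch below is only an illustration under our own assumptions: the rational numbers (Python's Fraction type) stand in for the real numbers, and the names in_P, less, and transitive_on are ours. The order is defined from P exactly as above, and the transitive law is then tested on a finite sample; this is a spot check, not a proof.

```python
from fractions import Fraction
from itertools import product


def in_P(x):
    """Membership in P, modeled here by the positive rationals.

    A Fraction is stored in lowest terms with a positive denominator,
    so it is positive exactly when its numerator is a positive integer.
    """
    return x.numerator > 0


def less(a, b):
    """The order relation recovered from P: a < b means b - a is in P."""
    return in_P(b - a)


def transitive_on(samples):
    """Check the transitive law on every triple drawn from `samples`."""
    return all(
        less(a, c)
        for a, b, c in product(samples, repeat=3)
        if less(a, b) and less(b, c)
    )


# A small, arbitrary collection of rationals used for the spot check.
SAMPLES = [Fraction(n, d) for n in range(-4, 5) for d in (1, 2, 3)]
```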

From a theoretical point of view, a mathematician who is interested 
in abstract mathematics might prefer to use the second set of axioms to 
specify the properties of order, but the first set is probably of more practical 
use, and is a little easier to work with in the actual application of the order 
properties. 

The fact that these two sets of axioms are equivalent and that either 
set can be used may be surprising to the student. He should realize that 
axioms are not intrinsically determined, but are chosen to accomplish a 
specific purpose. 


PROBLEMS 


1. Prove Theorems 1-4 through 1-11. Each may be used in the proof of any 
subsequent ones. The proofs should not make use of the alternative order 
axioms. The following hints may prove useful. 


Theorem 1-4: a + (−a) = 0. Use Axiom 1. 

Theorem 1-7: Use the previous theorems. 

Theorem 1-9: Use Axiom 4 and Theorem 1-7. 

Theorem 1-11: Prove in two parts. First prove that if a > 0, then a⁻¹ > 0. 
Try applying Axiom 1 to a⁻¹. 


2. Prove the alternative axioms from the first set. The truth of Theorems 1-3 
through 1-11 may be assumed. 


3. Prove the first set of order axioms from the alternative set. One has already 
been done for you (Theorem 1-12). Be careful not to use anything that is not 
given or has not yet been proved, starting from the alternative set. 


4. If a < b, is a⁻¹ > b⁻¹? What if a < 0? What if b < 0? 


5. Prove that the field of Problem 3 in the last section cannot be ordered. 
[Hint: 0 and 1 are different numbers. Use the Trichotomy law, assuming that 
the field can be ordered. Add 1 to each side of the assumed order relation.] 


1-4 THE COMPLETENESS AXIOM 


The field axioms and the order axioms of the last two sections still do 
not characterize the real numbers completely. The set of all rational 
numbers (which can be represented in the form p/q, where p and q ≠ 0 
are integers) satisfies the field axioms and the order axioms, but not all 
real numbers are rational numbers. This fact was known to the early 
Greek mathematicians. Indeed, it is considered probable that the existence 
of such irrational numbers was one of the closely guarded secrets of the 
Pythagoreans. 

To the Greek mathematicians, geometric facts were of primary im- 
portance, and in their view, numbers were closely related to the lengths 
of line segments. Let us try to see how this point of view operates. Imagine 
a straight line (extended indefinitely in both directions) and mark two distinct 
points on this line. Label one of these points 0 and the other 1. 
The line segment between these two points is taken as our unit length. 
By the construction methods of euclidean geometry this unit length can 
be laid out successively along the line to give us points we can label 2, 3, 
and so on. 

If q is any positive integer, other construction methods allowed by 
euclidean geometry can be used to subdivide each of these segments of 
unit length into q segments of equal length. Doing this, we in effect 
construct a ruler having the initially given line segment as its unit distance, 
and with marks located at a distance 1/q apart. 

In this way we see that by the euclidean “straightedge and compass” 
constructions we can find a line segment whose length is any (positive) 
rational number. 

But we can also construct a line segment of length √2. If a square is 
erected with its base as the line segment between the points labeled 0 
and 1, then by the Pythagorean theorem, the diagonal of this square is 
of length √2. This length can be transferred to the given line (Fig. 1-1) 
to yield a point which we can label √2. However, √2 is not a rational 
number, as we will now prove. 


[FIGURE 1-1: a line with the points labeled 0, 1, √2, and 2] 


Suppose, on the contrary, that there exist two integers p and q such 
that √2 = p/q. Then 2 = p²/q², or 


p² = 2q². 


The square of any odd number must be an odd number. This follows 
since any odd number is of the form 2k + 1, where k is an integer, and 
its square is of the form 4k² + 4k + 1, which leaves a remainder of one 
when divided by 2. Each of the numbers p and q contains some factors of 
two so that 

p = 2^n k, 


q = 2^m j, 


where n and m are nonnegative integers (either could be zero) and k and 
j are odd numbers. Then p² = 2^(2n) k² and hence is an odd number times 
an even number of factors of 2. However, p² = 2q² = 2 · 2^(2m) j² = 2^(2m+1) j² 
is at the same time an odd number times an odd number of factors of 2, 
which is impossible. The same number cannot contain both an even 
and an odd number of factors of 2. 
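
The counting argument in this proof can be watched at work on particular integers. The following Python sketch is an illustration only, and the function names are our own. It counts the factors of 2 in p² and in 2q² and confirms that the first count is even while the second is odd, so that the two numbers can never be equal. Checking finitely many pairs proves nothing by itself; the proof above is what covers all p and q.

```python
def factors_of_two(n):
    """Number of times 2 divides the positive integer n."""
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count


def parity_mismatch(p, q):
    """True if p*p has an even, and 2*q*q an odd, number of factors of 2."""
    even_for_p = factors_of_two(p * p) % 2 == 0
    odd_for_q = factors_of_two(2 * q * q) % 2 == 1
    return even_for_p and odd_for_q
```

For example, with p = 12 and q = 6 we have p² = 144 = 2⁴ · 9 (four factors of 2) and 2q² = 72 = 2³ · 9 (three factors of 2).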

We see therefore that there are points on the “measuring line” that we 
constructed which are not at a rational distance from the origin, or equiv- 
alently, that there are real numbers which are not rational numbers. Thus 
there must be still another property satisfied by the set of all real numbers 
which we have not yet listed. The missing property is the completeness 
property. This property can be explained intuitively as stating that if we 
assign numbers to the points on a line as described above, then every 
point on the line corresponds to a real number. 

Before making a formal statement of the completeness axiom, we 
should remark about the way in which irrational numbers are used. First, 
consider √2. If you were asked, “What is the square root of two?,” 
what would your answer be? If it is 1.414 or the like, you are wrong. The 
only correct answer which can be given to this question is “The real 
number whose square is two.” Any other must be wrong. The square of 
1.414, for example, is 1.999396, which is not 2. 

Of course the student who says that the square root of 2 is 1.414 might 
argue that he meant 1.414..., the three dots indicating additional 
decimal places which could be specified if necessary. This would be a 
semicorrect answer (if given this completely). It would be a correct 
answer if an explanation were also given of how the missing decimal 
places could be determined. 

As another illustration, what is the value of π? The numbers 22/7 or 
3.1416 are commonly used in place of π, but they are of course not the 
same thing. The number π is also an irrational number and its decimal 
expansion goes on indefinitely. However, this is immaterial in actual 
practice. Many handbooks list ten or twelve decimal places of the value 
of π. How accurate is such a value? The circumference of a circle one 
meter in diameter is determined to within 10⁻¹⁰ meter by using 10 decimal 
places of π. This is one angstrom unit and is of the order of the distance 
between atoms in ordinary matter. It would hardly be of any practical 
concern to know the value of π any more accurately than this. 

However, no finite number of decimal places suffices to determine such 
irrational numbers exactly. Even if the difference is too small to matter 
in any practical situation, it is still there. With regard to rational numbers, 
the situation is different. Many rational numbers also have nonterminating 
decimal expansions (e.g., 1/3 = 0.33333...), but a group of 
digits repeats in these expansions, and it is always possible to determine 
the actual number being represented. 

When we write π = 3.14159 ..., what do we mean? Isn’t it true that 
we mean that the rational number 3.14 = 314/100 is smaller than the real 
number π while 3.15 is larger than π? That 3.141 < π < 3.142, and so 
on? Thus an infinite decimal expansion is nothing more than a sequence 
of rational numbers. The longer the expansion is carried out, the closer is 
its value to the desired number in terms of the decimal approximation 
from below. 
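
The idea that a decimal expansion is a sequence of rational numbers closing in on the desired number can be made concrete. In the sketch below we use √2 in place of π, an assumption made purely for convenience, because the decimal truncations of √2 can be found exactly with integer arithmetic. The function name sqrt2_bounds is our own.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_bounds(digits):
    """Return rationals (lower, upper) with lower < sqrt(2) < upper.

    `lower` is the decimal expansion of sqrt(2) cut off after `digits`
    places, and `upper` is `lower` with its last place raised by one.
    """
    scale = 10 ** digits
    # isqrt gives the largest integer whose square does not exceed its argument.
    lower = Fraction(isqrt(2 * scale * scale), scale)
    upper = lower + Fraction(1, scale)
    return lower, upper
```

With three places the function returns 1.414 and 1.415 as exact fractions, and the square of the first is 1.999396, in agreement with the value quoted earlier. No choice of `digits` makes either square equal to 2.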

With this as a background, we now give the final axiom for the real 
number system. In order to do so, however, we must first define some of 
the terms used in the axiom. 


Definition 1-2. A set of real numbers A is said to be bounded above if 
there exists a number m such that m ≥ x for every x in the set A. 
The number m in this case is called an upper bound of the set A. 


Definition 1-3. A number m is called the least upper bound of a set of 
real numbers A if and only if 


(1) m is an upper bound of the set A, 

and 
(2) if k is any real number which is less than m, then there is some 
x in the set A such that x > k. 


THE COMPLETENESS AXIOM: If A is any set of real numbers which is 
bounded above, then there exists a real number m which is the least upper 
bound of A. 
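
To see what the axiom demands, it may help to look at a set whose least upper bound is not rational. The sketch below is our own illustration and is not used elsewhere in the text. It takes A to be the set of rational numbers x with x² < 2 and repeatedly halves an interval whose left end lies in A and whose right end is an upper bound of A. The names is_upper_bound and squeeze are assumptions of this example.

```python
from fractions import Fraction


def is_upper_bound(m):
    """True if m is an upper bound of A = {x rational : x*x < 2}."""
    return m > 0 and m * m >= 2


def squeeze(steps):
    """Halve the interval [lo, hi] `steps` times.

    Throughout, lo is a member of A and hi is an upper bound of A.
    """
    lo, hi = Fraction(1), Fraction(2)  # 1 is in A; 2 is an upper bound of A
    for _ in range(steps):
        mid = (lo + hi) / 2
        if is_upper_bound(mid):
            hi = mid
        else:
            lo = mid
    return lo, hi
```

However many steps are taken, a still smaller rational upper bound can be found at the next step, so no rational number is the least upper bound of A. The completeness axiom asserts that a real number with this property exists nevertheless.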


We are not going to make formal use of this property of the real numbers 
in this book. We list it here only because it is the last of the axioms for 
the real number system. It so happens that the field axioms, the order 
axioms, and this single completeness axiom characterize the real number 
system completely. Students who go on to take advanced mathematics 
courses will probably see a proof of this fact. 

In this book, we are only interested in the point of view, expressed 
above, that this axiom requires a correspondence between all the points 
on a line and the real numbers. Before leaving this section we would like 
to emphasize this point and formalize the discussion of the coordinates on 
a line. 

When numbers were associated with the points on a line as described 
above, we considered only the points to the “right” of the point labeled 
zero. We can extend the labeling to the “left” of zero as well. Clearly, 
such points must correspond to the negative real numbers. When this 
is done, however, we find that we have a point on the line associated with 
every real number. 


Definition 1-4. A coordinate line (or coordinate axis) is a line together 
with an association between the real numbers and the points on the 
line so that each point corresponds to a unique real number, called the 
coordinate of that point, and each real number corresponds to a unique 
point. Furthermore, this assignment of coordinates with points must be 
such that the distance between two points of the line (in terms of a 
particular unit of distance) is the difference of the coordinates of these 
points. The point associated with the real number zero is called the origin 
of the coordinate line. 

Note that there is a tacit assumption of the existence of a concept of 
distance in the plane. Almost any suitable axiom scheme for euclidean 
geometry will make such a distance concept available. 

Suppose that instead of two points on the line to start with, we were 
given a single point, which is to be the origin, and a unit distance (say 
as the length of an entirely separate line segment). Then there are two 
distinct ways in which the line can be turned into a coordinate line. There 
are two points on the line a unit distance from the origin, and either of 
these could be labeled 1 (the other would then be −1), giving us two 
distinct coordinate lines with the same origin and the same unit of dis- 
tance. 


PROBLEMS 


1. Give an example of a set of numbers which is not bounded above. 


2. (a) Given A, a set of numbers which is bounded above, exactly what would 
you mean by saying that the number m is the maximum of the set A? 
(b) If m is the maximum of A, how would m be related to the least upper 
bound of A? 
(c) Does every set A which is bounded above have a maximum? 


3. A method used in many high school courses for approximating the square 
root of a number a is as follows: let x be an approximation to the square 
root of a. Then 

y = (x² + a)/(2x) 

is a closer approximation to √a. For the following problems, let a = 2. 


(a) Show that if x is a rational number, then y is a rational number. 
(b) Compute y² − 2 in terms of x. 

(c) Show that if x > √2, then y > √2. 

(d) Show that if x > √2, then 


y² − 2 = (x² − 2)K, 
where 0 < K < 1/4. What does this say about the closeness of the approximation 
of y to √2? Can you improve this result if x is quite close to √2? 


4. Among all numbers of the form p/q, where p and q are integers and 0 < q < 10, 
which is the closest to √2? 


5. Let a and b be two real numbers with a < b. Prove that if c = ½(a + b), 
then a < c < b. 


6. Using the results of Problem 5, prove that there is no largest negative number 
(that is, that the set of negative numbers does not have a maximum). [Hint: 
Assume that there is a maximum and arrive at a contradiction.] 


1-5 ABSOLUTE VALUE 


Let us open this section with a simple question: 
Let a be a real number. Is −a positive or negative? 


The student should answer this question for himself before reading on. 
Many students will answer it incorrectly unless they pause to think about 
it a moment. 

Did you say, or was your first thought, that −a is negative? This is 
the usual incorrect answer. What makes it negative? Where was it said 
that a is positive? If a is a negative number, then −a must be positive. 
Actually of course this is an improper question. As worded, the correct 
answer would have to be that −a may be positive, negative, or neither 
(the last if a = 0). 

The point here is that after years of experience in seeing numbers of the 
type −2, −3, and recognizing them as negative, many students gain a 
deep-rooted feeling that a negative sign indicates a negative number. Yet 
this is true only if the quantity behind the negative sign is positive. 

A negative sign does not indicate that a number is negative. Some of the 
difficulty comes from the common habit of reading −a as “negative a.” 
This is wrong. It should be read as “the negative of a.” This small distinction 
is quite important. 

Now, if the student has the above comments firmly in mind, we can 
proceed to give a definition of the absolute value. 


Definition 1-5. Let a be a real number. Then the absolute value of a 
is the real number |a| defined by 


|a| = a     if a ≥ 0, 
|a| = −a    if a < 0. 


Various other explanations of this same property are possible. For 
example, we could define 

|a| = √(a²), 


where the square root symbol is understood (as always) to mean the 
positive square root. This definition is, however, less elementary and 
more difficult to work with. 

The absolute value of a real number is frequently described as the 
“nonnegative magnitude of the number.” This phrase is descriptive but 
too vague for use as a definition. It is sometimes also described as “the 
distance of the point having that coordinate from the origin on a coordinate 
line.” Again, this is a descriptive phrase that leaves much to be desired 
in terms of usability. 

Another possible definition would be: the number |a| is the maximum of 
the two numbers a and −a. This definition is as good as the one given 
above. In fact, it is probably better from a theoretical point of view. 
However, as will be seen, the use of the definition given is probably 
easier. Proofs of results are not so short as they could be, but are easier 
to discover, since Definition 1-5 clearly calls for the breaking down of the 
problem into the various possible cases. 
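
The three descriptions of the absolute value mentioned so far (the definition by cases, the positive square root of a², and the maximum of a and −a) can be set side by side. The Python sketch below is only an illustration; the function names are ours, and the square-root version is restricted to integers so that the root can be taken exactly.

```python
from math import isqrt


def abs_by_cases(a):
    """Definition 1-5: a itself if a >= 0, otherwise the negative of a."""
    return a if a >= 0 else -a


def abs_by_max(a):
    """The larger of the two numbers a and -a."""
    return max(a, -a)


def abs_by_square_root(a):
    """The nonnegative square root of a*a (integers only, so it is exact)."""
    return isqrt(a * a)
```

Note how the first function, like Definition 1-5, splits at once into two cases; this is the feature that makes the definition convenient in proofs.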

Let us list a few simple properties of the absolute value which can be 
proved directly from the definition. 


Theorem 1-13. For any real number a, |a| ≥ 0. |a| = 0 if and only if 
a = 0. 


Proof: There are actually three separate results to be proved here. First, 
we must prove that if a is any real number, then |a| ≥ 0. This result is 
obvious upon examination of the two possible cases in the definition. 

Next, the theorem makes an “if and only if” statement. To prove this 
we must prove the two separate statements: 


(1) |a| = 0 if a = 0. 
(2) |a| = 0 only if a = 0. 


Here, however, statement (1) is equivalent to the statement: 


(1′) If a = 0, then |a| = 0, 

and this too is obvious from the definition. 
Statement (2) is equivalent to: 


(2′) If |a| = 0, then a = 0. 


We can prove this by considering the various possible cases. 

Suppose |a| = 0. Then there are only three possibilities: a > 0, a < 0, 
or a = 0. If a > 0, then |a| = a > 0, which violates our supposition. 
If a < 0, then |a| = −a > 0, which also violates the assumption. Hence 
a = 0, since this is the only remaining possibility. 


Theorem 1-14. For any real number a, 


−|a| ≤ a ≤ |a|. 


Proof: If a ≥ 0, then |a| = a; therefore the conclusion of the theorem 
is true. If a < 0, then |a| > 0, and hence −|a| = a < 0 < |a| (see 
Theorems 1-4 and 1-8). 


Theorem 1-15. For any real numbers a and b, 

|ab| = |a| · |b|. 


Proof: This can be proved by the most direct method possible, that is, 
by considering the four possible cases. 


Case I. a ≥ 0, b ≥ 0. Then ab ≥ 0 and |a| = a, |b| = b, |ab| = ab; 
hence the theorem is true. 


Case II. a ≥ 0, b < 0. Then ab ≤ 0 and |a| = a, |b| = −b, |ab| = 
−ab. But in this case, |a| · |b| = a(−b) = −ab = |ab|, and hence the 
theorem is again true. 


The remaining two cases are left as an exercise. 


Theorem 1-16. If a² < b², then |a| < |b|. 


Proof: Suppose a² < b². Then b² − a² > 0. However, we know 
(Problem 1) that b² = |b|² and a² = |a|², and therefore |b|² − |a|² > 0. 
The expression on the left-hand side of this inequality can be factored to 
give 


(|b| − |a|) · (|b| + |a|) > 0. 


Now |b| + |a| ≥ 0 (why?), and hence the first factor cannot be negative 
or zero. Therefore |b| − |a| > 0, which is equivalent to the conclusion 
of the theorem. 

These results demonstrate how the definition can be used to give rigorous 
proofs for the properties of the absolute value. 

There is one more theorem which we wish to give in this section. This 
is a major result which will be used throughout the rest of the student’s 
study of mathematics. In fact, it is the cornerstone of most proofs in 
calculus, and probably deserves to be called the fundamental theorem of 
analysis (although it never is). This result is the triangle inequality. The 
reason for this name is not obvious here, but will appear when we study 
vectors. 


Theorem 1-17. (The Triangle Inequality) For any real numbers a and b, 


|a + b| ≤ |a| + |b|. 


Proof: We note that 

|a + b|² = (a + b)² 
         = a² + 2ab + b² 
         = |a|² + 2ab + |b|². 

Using the fact that ab ≤ |ab| = |a| |b|, we therefore have 

|a + b|² ≤ |a|² + 2|a| |b| + |b|² 
         = (|a| + |b|)². 


Theorem 1-16 then shows that this result implies the conclusion of the 
theorem. To see this, observe that what we have actually proved is 


(|a + b|)² ≤ (|a| + |b|)². 

Putting this into Theorem 1-16 (which holds equally with ≤ in place of <) 
as a hypothesis gives us as a conclusion 

|(|a + b|)| ≤ |(|a| + |b|)|. 


However, |(|a + b|)| = |a + b| and |(|a| + |b|)| = |a| + |b| (why?), 
which shows that we have completed the proof of the theorem. 
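
The key step of this proof is that the two squares differ by exactly 2(|a| |b| − ab), a quantity which is never negative. The Python sketch below, offered only as an illustration with names of our own choosing, verifies this identity and the triangle inequality itself on a finite sample of rational numbers. Such a check can expose a misprint in a formula, but it is no substitute for the proof.

```python
from fractions import Fraction
from itertools import product


def squared_gap(a, b):
    """(|a| + |b|)^2 - |a + b|^2, which the proof shows is nonnegative."""
    return (abs(a) + abs(b)) ** 2 - abs(a + b) ** 2


def check_triangle(samples):
    """Test the identity from the proof, and the inequality, on all pairs."""
    for a, b in product(samples, repeat=2):
        # The two squares differ by exactly 2(|a||b| - ab).
        if squared_gap(a, b) != 2 * (abs(a) * abs(b) - a * b):
            return False
        if abs(a + b) > abs(a) + abs(b):
            return False
    return True


# A small, arbitrary collection of rationals used for the spot check.
SAMPLES = [Fraction(n, d) for n in range(-5, 6) for d in (1, 2, 3)]
```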

We should remark on the usage of notations exhibited in this proof. 
When several lines are shown linked by equalities or inequalities we are 
to think of the expressions as being written sequentially on one line. Thus, 
for example, the lines 


|a + b|² = |a|² + 2ab + |b|² 
         ≤ |a|² + 2|a| |b| + |b|² 
         = (|a| + |b|)² 


would be read 

|a + b|² = |a|² + 2ab + |b|² ≤ |a|² + 2|a| |b| + |b|² = (|a| + |b|)², 


from which we conclude that |a + b|² ≤ (|a| + |b|)². 

In similar displays, some writers prefer to think of the top left-hand 
expression as being understood as the left-hand member of each line. To 
do so in the above display, the last equality would have to be replaced 
by the previous inequality. This, however, would interfere with the 
clarity of the relationship between successive lines. 


Theorem 1-18. Let p be some positive number. Then |a| < p if and 
only if −p < a < p. 


Proof: Here again we have an “if and only if” theorem and hence must 
prove two results. 

First, suppose that |a| < p. We wish to prove that this implies that 
−p < a < p. There are two possible cases. If a ≥ 0, then a = |a| < p, 
and since at the same time we have −p < 0 ≤ a, these two facts together 
give us −p < a < p. On the other hand, if a < 0, then we have 
−a = |a| < p. From this we conclude a > −p. Putting this together 
with a < 0 < p gives the desired result −p < a < p. 

Next, to prove the “if” part of the theorem, let us suppose that 
−p < a < p. We wish to prove from this assumption that |a| < p. 
However, if a ≥ 0, then |a| = a < p, while if a < 0, then |a| = 
−a < p (since −p < a implies that −a < p). Thus, in either case, 
we have the desired result. 

An example of the use of this theorem might be of interest. Suppose 
that we are given that x satisfies |2x − 1| < 9. From the above theorem 
we can conclude that −9 < 2x − 1 < 9. What is more, since the theorem 
said “if and only if,” the set of all x which satisfy this last relation is the 
same as the set satisfying the first. Since we can add the same amount to 
both sides of an inequality, the last relation is equivalent to −8 < 2x < 10, 
which in turn is equivalent to −4 < x < 5. Thus we see that 


{x | |2x − 1| < 9} = {x | −4 < x < 5}. 
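
The steps of this example apply to any inequality of the form |cx + d| < p with c ≠ 0 and p > 0, and they can be collected into a short routine. The Python sketch below is only an illustration; the function name solve_abs_less and the letters c and d for the coefficients are our own choices. Exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def solve_abs_less(c, d, p):
    """Return (u, v) with {x : |c*x + d| < p} = {x : u < x < v}.

    Assumes c != 0 and p > 0. By Theorem 1-18 the condition is
    -p < c*x + d < p; subtracting d and dividing by c gives the two
    endpoints, which change places when c is negative.
    """
    if c == 0 or p <= 0:
        raise ValueError("need c != 0 and p > 0")
    ends = (Fraction(-p - d, c), Fraction(p - d, c))
    return min(ends), max(ends)
```

For the inequality of the example, `solve_abs_less(2, -1, 9)` returns the endpoints −4 and 5.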


PROBLEMS 


1. Prove from the definition of the absolute value that |a|² = a² for any real 
number a. 


2. Complete the proof of Theorem 1-15. 

3. Prove the converse of Theorem 1-16. That is, prove that if |a| < |b|, then 
a² < b². 

4. In the proof of the triangle inequality, the inequality occurs at only one 
point. Under what conditions will |a + b| = |a| + |b|? 

5. Prove the triangle inequality directly by considering the four cases: 


Case I.   a ≥ 0 and b ≥ 0; 
Case II.  a ≥ 0, b < 0, and a + b ≥ 0; 
Case III. a ≥ 0, b < 0, and a + b < 0; 
Case IV.  a < 0 and b < 0. 


Why is it sufficient to consider only these cases? 

6. Prove that |x| > b if and only if x < −b or x > b. 


7. Find numbers u and v for each of the following parts such that the given set 
is equal to {x | u < x < v}. Show these sets on a coordinate line. 
(a) {x | |x − 2| < 7}    (b) {x | |x − 5| < 3} 
(c) {x | |x + 4| < $}    (d) {x | |8x − 5| < 5} 
(e) {x | |5x + 3| < 8} 


8. Show each of the following sets on a coordinate line. 
(a) {x | |x − 3| > 5}    (b) {x | |x + 4| > 4}    (c) {x | |2x − 4| > 3} 


In this section determinants of the second and third order will be discussed. 
Such determinants will be useful at various places in the remainder 
of the text in writing certain formulas in especially compact and 
easily remembered form. No attempt will be made to prove the existence 
of determinants of other orders or to give a rigorous discussion of de- 
terminants in general. At a later stage, when the required concepts have 
become familiar, it will be easy to come back and make a complete study 
of determinants. For our present needs, the discussion in this section will 
be enough. 


Definition 1-6. A determinant is a real-valued function of a square array 
of numbers. The value of a determinant of order two is defined as 


| a₁  a₂ | 
| b₁  b₂ |  =  a₁b₂ − a₂b₁, 


and the value of a determinant of order three is defined as 


| a₁  a₂  a₃ | 
| b₁  b₂  b₃ |  =  a₁b₂c₃ + a₂b₃c₁ + a₃b₁c₂ − a₃b₂c₁ − a₂b₁c₃ − a₁b₃c₂. 
| c₁  c₂  c₃ | 


The important thing to note about this definition is that in the expan- 
sion of the determinant, each term contains exactly one factor from each 
row and exactly one from each column. Furthermore, there is exactly one 
term for each possible combination. Thus the determinant of order three 
has six terms in its expansion, since there are three ways of choosing an ele- 
ment from the first row, and when this has been done, there remain two 
ways of choosing an element from the second row which is not in the 
column already used. After these two elements have been chosen, there 
is only one element in the third row which can be used. 

A similar situation holds for determinants of higher orders. Thus, for 
example, the expansion of a determinant of order four will contain twenty- 
four terms (why?). The only difficulty in defining determinants of ar- 
bitrary orders is in giving a rule for the determination of the sign to be 
attached to each term. 
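
Definition 1-6 translates directly into a short computation. The Python sketch below is an illustration only: an array is represented as a list of rows, and the names det2, det3, and number_of_terms are our own. The last function counts the ways of choosing one entry from each row with no column used twice, which is the number of terms in the expansion; it says nothing about the signs of those terms.

```python
from itertools import permutations


def det2(m):
    """Determinant of order two, from Definition 1-6."""
    (a1, a2), (b1, b2) = m
    return a1 * b2 - a2 * b1


def det3(m):
    """Determinant of order three, term by term from Definition 1-6."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * b2 * c3 + a2 * b3 * c1 + a3 * b1 * c2
            - a3 * b2 * c1 - a2 * b1 * c3 - a1 * b3 * c2)


def number_of_terms(order):
    """Count the choices of one entry per row with no column repeated."""
    return sum(1 for _ in permutations(range(order)))
```

For instance, `det2([[1, 2], [3, 4]])` is 1 · 4 − 2 · 3 = −2, and `number_of_terms` gives 6 for order three and 24 for order four.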

If we group the terms in the expansion of the determinant of order three 
properly and factor out a₁, a₂, and a₃, we find 


| a₁  a₂  a₃ | 
| b₁  b₂  b₃ |  =  a₁(b₂c₃ − b₃c₂) − a₂(b₁c₃ − b₃c₁) + a₃(b₁c₂ − b₂c₁). 
| c₁  c₂  c₃ | 


Comparing the terms in parentheses with the definition of a determinant 
of order two, we see that we have proved: 


Theorem 1-19. 


| a₁  a₂  a₃ | 
| b₁  b₂  b₃ |  =  a₁ | b₂  b₃ |  −  a₂ | b₁  b₃ |  +  a₃ | b₁  b₂ | . 
| c₁  c₂  c₃ |        | c₂  c₃ |        | c₁  c₃ |        | c₁  c₂ | 


Note that the determinants of order two on the right-hand side of the 
equality in this theorem are obtained from the original determinant by 
deleting the top row and one of the columns. In fact, we delete the row 
and the column which contain the factor a₁, a₂, or a₃ which we then use 
to multiply the resulting smaller determinant. The fact that the middle 
term in this expansion takes a negative sign is something which must be 
remembered. 
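
The expansion of Theorem 1-19 can be compared with the six-term definition on particular arrays. The following Python sketch is an illustration only; arrays are lists of rows and the function names are our own. Agreement on a few arrays is of course not a proof, which comes from the grouping of terms carried out above.

```python
def det2(m):
    """Determinant of order two, from Definition 1-6."""
    (a1, a2), (b1, b2) = m
    return a1 * b2 - a2 * b1


def det3_by_definition(m):
    """Determinant of order three, term by term from Definition 1-6."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * b2 * c3 + a2 * b3 * c1 + a3 * b1 * c2
            - a3 * b2 * c1 - a2 * b1 * c3 - a1 * b3 * c2)


def det3_by_first_row(m):
    """Theorem 1-19: expand along the top row; the middle term is negative."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * det2([[b2, b3], [c2, c3]])
            - a2 * det2([[b1, b3], [c1, c3]])
            + a3 * det2([[b1, b2], [c1, c2]]))
```

Each determinant of order two in `det3_by_first_row` is built from the entries that remain after the top row and the column of its multiplier are deleted.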

We may now proceed to prove a number of properties of determinants. 
We will state these results as theorems without reference to the order of 
the determinant since they are actually true for determinants of any order. 
The proofs given here, however, apply only to the cases of order two and 
three. 


Theorem 1-20. If the rows and columns are interchanged in a de- 
terminant, the value of the determinant is unchanged. 


Remarks. For the case of order three, this says, for example, that 


| a₁  a₂  a₃ |     | a₁  b₁  c₁ | 
| b₁  b₂  b₃ |  =  | a₂  b₂  c₂ | . 
| c₁  c₂  c₃ |     | a₃  b₃  c₃ | 


This result can be proved by direct expansion of each of the determinants 
involved. This proof is trivial for a determinant of order two and not much 
more difficult in the case of order three. The student is invited to give the 
proofs as one of the problems at the end of this section. 

The square array of numbers which results after interchanging the rows 
and columns in this way is usually called the transpose of the original array. 
Theorem 1-20 could thus be restated in the form: the determinant of an 
array is equal to the determinant of the transpose of that array. 


Theorem 1-21. If in a determinant two rows (or columns) are interchanged, 
the value of the determinant is changed in sign. 


Remarks. In the case of a determinant of order three, an example of 
this fact is 


| a₁  a₂  a₃ |       | a₁  a₂  a₃ | 
| b₁  b₂  b₃ |  =  − | c₁  c₂  c₃ | . 
| c₁  c₂  c₃ |       | b₁  b₂  b₃ | 


It suffices to prove this theorem for either rows or columns. The remain- 
der of the result would then follow by using Theorem 1-20. Thus, for ex- 
ample, if Theorem 1-21 had been proved for columns, its proof for rows 
would proceed as below: 

Let A be a square array of numbers and let A’ be the transpose of this 
array. Let B be the array which results from the interchange of two rows 
of A, and let B’ be the transpose of B. Then it is clear that we can obtain 
B’ also by interchanging two columns of A’. Hence, writing |A| for the 
determinant of A, we have 


$$|B| = |B'| = -|A'| = -|A|.$$


To prove this theorem for the interchange of columns of a determinant 
of order two is trivial. In the case of a determinant of order three, we can 
use the decomposition given in Theorem 1-19 to help us. The student 
should try writing out this decomposition for cases where the first and 
second columns have been interchanged and when the first and last 
columns have been interchanged to see how the proof may be accomplished. 


Theorem 1-22. If two rows (columns) in a determinant are identical, 
the value of the determinant is zero. 


Proof: Merely apply the previous theorem. If the two identical rows 
are interchanged, then on the one hand the value of the determinant is un- 
changed and on the other hand is changed in sign. The only number which 
is its own negative is zero. 


Theorem 1-23. If all of the entries in a row (column) of a determinant 
are multiplied by a constant k, then the value of the determinant is also 
multiplied by this constant. 


Proof: This says, for example, that 


$$
\begin{vmatrix} a_1 & a_2 & a_3 \\ kb_1 & kb_2 & kb_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= k \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix},
$$

allowing us to factor a constant out of a row or a column. The proof is
immediate if the constant k multiplies the elements of the first row
and if we look at the decomposition given by Theorem 1-19. To prove it
for another row, we may use Theorem 1-21 as follows. Suppose we are
given a square array of numbers A. Let B be the square array which results
when the ith row of A is multiplied by k ($i \neq 1$). Let A* be the square
array obtained by interchanging the first and the ith rows of A, and let
B* be the array resulting from the interchange of the first and ith rows
of B. Then B* can be obtained from A* by multiplying the top row of
A* by k. Therefore,


$$|B| = -|B^*| = -k|A^*| = k|A|.$$


The proof for columns is obtained by using Theorem 1-20 in a similar 
manner. 


Theorem 1-24. Let two determinants of the same order be identical 
except in one given row (column). Then the sum of the values of the 
determinants is the value of the determinant with the common rows 
(columns) and the sum of the corresponding elements in the remaining 
row (column). 


Proof: An example may help to make this clearer. This theorem asserts 
that 


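$$
\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 + d_1 & b_2 + d_2 & b_3 + d_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
+ \begin{vmatrix} a_1 & a_2 & a_3 \\ d_1 & d_2 & d_3 \\ c_1 & c_2 & c_3 \end{vmatrix}.
$$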


The proof (using Theorem 1-19) is again quite simple if it is the top 
row which is different in the two determinants. It may then be done for 
an arbitrary row by interchanging rows just as in the proof of Theorem 
1-23, and then for columns by using Theorem 1-20. 

Actually, this theorem depends only on the fact that in the expansion 
of the determinant, each term contains one and only one (linear) factor 
from each row and column. 

The next property of determinants is of considerable value in practice, 
since it yields a method for the simplification of determinants. 


Theorem 1-25. In a given determinant, a constant multiple of the
elements in one row (column) may be added to the elements of another
row (column) without changing the value of the determinant.


Proof: That is, for example,

$$
\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 + ka_1 & b_2 + ka_2 & b_3 + ka_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}.
$$

The proof follows from the previous theorem. The value of the deter-
minant on the left-hand side of the above is equal to

$$
\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
+ \begin{vmatrix} a_1 & a_2 & a_3 \\ ka_1 & ka_2 & ka_3 \\ c_1 & c_2 & c_3 \end{vmatrix}.
$$


By Theorem 1-23, the constant k in the second row of the last determinant 
can be factored out, and then Theorem 1-22 shows that the value of this 
determinant is zero. 
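Theorems 1-20 through 1-25 lend themselves to a quick numerical check. The following Python sketch is an illustration added here, not part of the original development: the helper names `det2`, `det3`, and `transpose` and the sample array `A` are hypothetical, and a check on one array is no substitute for the proofs.

```python
def det2(a, b, c, d):
    """Second-order determinant with rows (a, b) and (c, d)."""
    return a * d - b * c


def det3(m):
    """Third-order determinant, expanded along the top row as in Theorem 1-19."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * det2(b2, b3, c2, c3)
            - a2 * det2(b1, b3, c1, c3)
            + a3 * det2(b1, b2, c1, c2))


def transpose(m):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*m)]


A = [[2, -1, 3], [4, 0, 5], [-2, 7, 1]]  # arbitrary sample array
k = 3

swapped = [A[1], A[0], A[2]]                      # rows one and two interchanged
scaled = [A[0], [k * x for x in A[1]], A[2]]      # second row multiplied by k
sheared = [A[0], [b + k * a for a, b in zip(A[0], A[1])], A[2]]  # row 2 + k * row 1
repeated = [A[0], A[0], A[2]]                     # two identical rows

assert det3(transpose(A)) == det3(A)      # Theorem 1-20
assert det3(swapped) == -det3(A)          # Theorem 1-21
assert det3(repeated) == 0                # Theorem 1-22
assert det3(scaled) == k * det3(A)        # Theorem 1-23
assert det3(sheared) == det3(A)           # Theorem 1-25
```

Integer entries are used so that the comparisons are exact.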

The final result of this section is of both theoretical and practical im- 
portance. In fact, it could be used to define higher-order determinants. 


Definition 1-7. The minor of an element in a determinant is the de- 
terminant of lower order which results from the deletion of the row 
and the column containing that element. 


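For example, the minor of the element $b_3$ in the determinant

$$
\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
$$

is the determinant

$$
\begin{vmatrix} a_1 & a_2 \\ c_1 & c_2 \end{vmatrix}.
$$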








Definition 1-8. The cofactor of an element in a determinant is $(-1)^s$
times the value of the minor of that element, where s = i + j, the
given element being in the ith row (counting from the top) and the
jth column (counting from the left).


Thus, for example, the cofactor of the element $b_3$ in the above example
is

$$
(-1)^{2+3} \begin{vmatrix} a_1 & a_2 \\ c_1 & c_2 \end{vmatrix}.
$$








The factor $(-1)^s$ is +1 for the element in the upper left-hand corner and
is alternately +1 and -1 in a checkerboard pattern throughout the determinant.


Theorem 1-26. The value of a determinant is equal to the sum of the
products of the elements of a given row (column) and their cofactors.


Proof: Two examples of this expansion are 


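$$
\begin{aligned}
\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
&= a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix}
- a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix}
+ a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \\
&= -a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix}
+ b_2 \begin{vmatrix} a_1 & a_3 \\ c_1 & c_3 \end{vmatrix}
- c_2 \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix}.
\end{aligned}
$$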




















The result of this theorem applied to the top row (the first line in the 
above example) is exactly the statement of Theorem 1-19. We can prove 
it for the second row by interchanging rows one and two. This changes 
the sign of the determinant and each cofactor in the expansion will have 
its sign changed. To prove it for the third row (having proved it for the 
second), interchange rows two and three. 

Note how simple this result is if all of the elements except one in a given
row (or column) are zero. All determinants can be reduced to this form
by the application of Theorem 1-25. Thus, for example,


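$$
\begin{aligned}
\begin{vmatrix} 1 & 3 & -7 \\ 2 & 4 & 5 \\ -3 & 1 & 15 \end{vmatrix}
&= \begin{vmatrix} 1 & 3 & -7 \\ 0 & -2 & 19 \\ -3 & 1 & 15 \end{vmatrix}
= \begin{vmatrix} 1 & 3 & -7 \\ 0 & -2 & 19 \\ 0 & 10 & -6 \end{vmatrix} \\
&= \begin{vmatrix} -2 & 19 \\ 10 & -6 \end{vmatrix}
= \begin{vmatrix} -2 & 19 \\ 0 & 89 \end{vmatrix} \\
&= -178.
\end{aligned}
$$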


Here, in the first step, the top row was multiplied by 2 and subtracted 
from the second row. Next, the first row was multiplied by 3 and added 
to the third row (in practice these two steps could be done simultaneously 
to save writing). Then the determinant was expanded by the cofactors 
of the first column, only one term appearing because of the two zeros. 
The resulting second-order determinant could be expanded directly, or, 
as was done, the top row multiplied by 5 and added to the second row to 
give a simpler expansion.

At some future time the student will see a proper treatment of determinants
of an arbitrary order. For the present, let us just state that the
theorems listed above are true for determinants of any order. In par- 
ticular, the theorem on expansion by cofactors could be used to define 
a determinant of nth order in terms of determinants of (n — 1)th order. 
With this type of definition, these theorems could all be proved. However, 
Theorem 1-20 would be quite difficult (and this is the main reason that 
we prefer to defer a complete discussion of determinants of an arbitrary 
order). 

If a higher-order determinant must be evaluated, use the methods dis- 
cussed above to reduce the order. Successive reductions of order will 
eventually lead to a second- or third-order determinant which can be 
evaluated. 
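The idea of defining an nth-order determinant through cofactors of order n - 1 can be sketched as a short recursive routine. This Python fragment is only an illustration under that assumption (expansion along the top row); the names `minor` and `det` are hypothetical, and the routine makes no attempt at the row reductions of Theorem 1-25 that keep hand computation short.

```python
def minor(m, i, j):
    """Array left after deleting row i and column j (indices start at 0)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant of a square list of lists, by cofactors of the top row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    # With 0-based column index j, the top-row cofactor sign is (-1)**j.
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(n))


# The third-order example worked in the text.
example = [[1, 3, -7], [2, 4, 5], [-3, 1, 15]]
print(det(example))  # -178
```

The same function accepts a fourth-order array, for which each of the four top-row cofactors is itself a third-order determinant.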


PROBLEMS 


1. Evaluate each of the following determinants, first by the definition, and then 
by the reduction technique illustrated at the end of this section. 






























































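(a) $\begin{vmatrix} 2 & -3 & 5 \\ 1 & 6 & 2 \\ 6 & 2 & -5 \end{vmatrix}$
(b) $\begin{vmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{vmatrix}$
(c) $\begin{vmatrix} 3 & 4 & 5 \\ 4 & 5 & 6 \\ 5 & 6 & 7 \end{vmatrix}$
(d) $\begin{vmatrix} 15 & 30 & -15 \\ 30 & -80 & 70 \\ 28 & 14 & -35 \end{vmatrix}$

2. Evaluate the following determinants:
(a) $\begin{vmatrix} 1 & 5 & -7 \\ 2 & 10 & 1 \\ 3 & 16 & 150 \end{vmatrix}$
(b) $\begin{vmatrix} 3 & -1 & 1 \\ 0 & 5 & -1 \\ 2 & 18 & -3 \end{vmatrix}$
(c) $\begin{vmatrix} 7 & 1 & -2 \\ 5 & 1 & 3 \\ 1 & -1 & -18 \end{vmatrix}$
(d) $\begin{vmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 5 \end{vmatrix}$

3. For what values of x do the following determinants have a value of zero?
(a) $\begin{vmatrix} 3 & -1 & 2 \\ x & 5 & 0 \\ -6 & 2 & x \end{vmatrix}$
(b) $\begin{vmatrix} 1 & 5 & -1 \\ 1 & 3 & x \\ 1 & x & 3 \end{vmatrix}$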
(c) $\begin{vmatrix} -x & 2 & 1 \\ 4 & 1 - x & 0 \\ 4 & -2 & 3 - x \end{vmatrix}$
(d) $\begin{vmatrix} x & 2x & -x \\ 1 & 0 & 3 \\ 0 & -1 & x \end{vmatrix}$














4. Prove Theorem 1-20 for determinants of order three.
5. Prove Theorem 1-21 for the interchange of the first and second columns.
6. Prove Theorem 1-21 for the interchange of the first and third columns. 
7. Expand the determinants in Problem 1 by the cofactors of the second column. 


8. Define a fourth-order determinant by 





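$$
\begin{aligned}
\begin{vmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{vmatrix}
&= a_1 \begin{vmatrix} b_2 & b_3 & b_4 \\ c_2 & c_3 & c_4 \\ d_2 & d_3 & d_4 \end{vmatrix}
- a_2 \begin{vmatrix} b_1 & b_3 & b_4 \\ c_1 & c_3 & c_4 \\ d_1 & d_3 & d_4 \end{vmatrix} \\
&\quad + a_3 \begin{vmatrix} b_1 & b_2 & b_4 \\ c_1 & c_2 & c_4 \\ d_1 & d_2 & d_4 \end{vmatrix}
- a_4 \begin{vmatrix} b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \\ d_1 & d_2 & d_3 \end{vmatrix}.
\end{aligned}
$$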











Prove Theorem 1-21 (for columns) for a fourth-order determinant. 


9. Using the reduction method illustrated in the text, evaluate 


$$
\begin{vmatrix} 3 & 14 & 4 & 10 \\ 1 & -6 & -2 & -5 \\ -1 & 3 & 1 & 0 \\ 0 & 10 & 3 & 5 \end{vmatrix}.
$$


2 


Analytic Geometry 
and Trigonometry 


2-1 THE CARTESIAN PLANE 


Suppose that we are given a unit of distance in the plane. This unit of 
distance can be used on any line in the plane to turn that line into a 
coordinate line. Thus we can measure the distance between any two 
points of the plane by supposing that a line has been drawn which passes 
through these two points and then using the unit of distance to measure 
the distance. 


FIGURE 2-1 (coordinate axes, labeled x-axis and y-axis)





Suppose also that we are given two straight lines which intersect at right 
angles in the plane. We arbitrarily assign a sense of direction to each, and 
make each into a coordinate axis by making the point of intersection of 
the two lines the origin of the coordinates on each. One line is called the 
x-axis and the other the y-axis. While this can be done in a completely 
arbitrary fashion, we make a conventional choice of the orientation of 
these two axes for the purposes of illustration. This choice is as shown 
in Fig. 2-1. The x-axis is horizontal, with the positive direction being to
the right. The y-axis is vertical, the positive direction being upward. 

The plane of these two lines is called the cartesian plane. If a point P 
is given in the plane, unique lines parallel to the two axes can be drawn 
through this point. These lines will intersect the axes at a pair of well-
determined points. Let the point at which the line parallel to the y-axis
cuts the x-axis have coordinate $x_1$ (on the x-axis) and let the correspond-
ing point on the y-axis have coordinate $y_1$. Then the pair of numbers
$(x_1, y_1)$ are called the coordinates of the point P. The coordinates are
written in the order shown. The first number, $x_1$ in this case, is called the
x-coordinate of the point, and the second is called the y-coordinate of the
point.

Suppose conversely that an ordered pair of real numbers is given. The 
first will determine a unique point on the x-axis and the second a unique
point on the y-axis. Through these points, perpendiculars can be drawn
to the corresponding axes. These will intersect at a unique point.

In other words, each point in the cartesian plane determines a unique
ordered pair of real numbers, and each ordered pair of real numbers de-
termines a unique point. There is thus a one-to-one correspondence be-
tween the points and the pairs of real numbers. Thus, by common usage
we will often speak of the point $(x_1, y_1)$, meaning the point with coordi-
nates $(x_1, y_1)$.

The particular point (0,0) at which the coordinate axes intersect is 
called the origin of the coordinates, and hence also the origin of the plane. 

In the next chapter, careful definitions will be given, and we will be able 
to prove many things while knowing exactly the foundation on which we 
are building. Here, however, we will assume a knowledge of the geometry 
of the plane upon which we superimpose the cartesian plane. One of the 
things we assume is the Pythagorean theorem. This allows us to de- 
termine the distance between two points of the cartesian plane. 


FIGURE 2-2 (the points $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$)





Suppose points $P_1$ and $P_2$ are given in the plane with coordinates
$(x_1, y_1)$ and $(x_2, y_2)$ respectively. Let the lines through $P_1$ and $P_2$ parallel
to the x- and y-axes intersect at C, as shown in Fig. 2-2. Then the triangle
whose vertices are $P_1$, $P_2$, and C is a right triangle with its right angle at C.
The length of the side $P_1C$ is $|x_1 - x_2|$ (as measured on the x-axis) and
the side $P_2C$ has length $|y_1 - y_2|$. From the Pythagorean theorem, we
see that the distance between $P_1$ and $P_2$ is

$$[(x_1 - x_2)^2 + (y_1 - y_2)^2]^{1/2}.$$


Definition 2-1. The distance between two points $P_1$ and $P_2$ in the
cartesian plane is denoted by $|P_1P_2|$.


Thus we have shown that if $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$, then
$$|P_1P_2| = [(x_1 - x_2)^2 + (y_1 - y_2)^2]^{1/2}. \tag{2-1}$$


For example, if A and B are the points with coordinates (3, —4) and 
(1, 2) respectively, then 


$$
\begin{aligned}
|AB| &= [(3 - 1)^2 + (-4 - 2)^2]^{1/2} \\
&= [2^2 + (-6)^2]^{1/2} \\
&= [4 + 36]^{1/2} \\
&= \sqrt{40}.
\end{aligned}
$$
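Formula (2-1) translates directly into a few lines of Python. The sketch below is an added illustration; the function name `distance` and the representation of points as pairs are assumptions made for the example.

```python
import math


def distance(p1, p2):
    """Distance |P1P2| between two points of the cartesian plane, by (2-1)."""
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)


A = (3, -4)
B = (1, 2)
# The squared distance is 4 + 36 = 40, as in the worked example.
assert math.isclose(distance(A, B), math.sqrt(40))
# The distance does not depend on the order of the two points.
assert distance(A, B) == distance(B, A)
```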


Now suppose a positive real number R and a point $P_0$ with coordinates
$(x_0, y_0)$ are given. What are the points which lie on the circle with radius
R and center $P_0$? Suppose P is such a point. Then $|PP_0| = R$. Hence if
P has coordinates (x, y), then $[(x - x_0)^2 + (y - y_0)^2]^{1/2} = R$, or
equivalently,

$$(x - x_0)^2 + (y - y_0)^2 = R^2. \tag{2-2}$$


Conversely, suppose that the coordinates of some point P = (x, y)
satisfy the relation (2-2). Comparing the equation with (2-1), we see that
$|PP_0|^2 = R^2$, or equivalently, $|PP_0| = R$. That is, the point P is at a
distance R from the point $P_0$, but this means that the point P is on the
circle of radius R with center $P_0$. Thus we have shown that the set of
points on this circle is exactly the same as the set of points whose co-
ordinates satisfy Eq. (2-2). Putting this into the form of a theorem,
we have


Theorem 2-1. The circle with radius R and center $P_0 = (x_0, y_0)$ in
the cartesian plane is


$$\{(x, y) \mid (x - x_0)^2 + (y - y_0)^2 = R^2\}. \tag{2-3}$$


An important point must be noted here. We see in (2-3) that a circle
is a set of points which satisfy a certain equation. This notation is com-
pletely accurate but rather cumbersome. For this reason, a very common
usage eliminates any mention of the set and we find phrases such as “the
circle $(x - x_0)^2 + (y - y_0)^2 = R^2$.” Such a phrase is clearly incorrect,
since the circle is the set of points which satisfy this equation and not the
equation itself. However, this is not a serious objection since the meaning
of the phrase could not be misunderstood.

In mathematics accuracy of thinking is essential. It is safest to be 
careful and always use accurate language. When the student is aware of 
the exact meaning needed, he may be allowed to use shorter terminology 
and phrases which are not quite precise, provided that the meaning is not 
lost or altered. 


Definition 2-2. The set of all points (x, y) in the cartesian plane which 
satisfy a given equation in the variables x and y is called the locus of 
that equation, and the equation is called the equation of that locus. 
Two equations are called equivalent if they have the same locus. * 


Thus, the locus of Eq. (2-2) is a circle. In using the terminology of this 
definition, the word “locus” is sometimes replaced by the name applied 
to that point set. In particular, we would say, for example, that (2-2) is 
the equation of a circle. 

For example, the circle with center (2, —1) and radius 3 would have the 
equation 

$$(x - 2)^2 + (y - (-1))^2 = 3^2$$
or


$$(x - 2)^2 + (y + 1)^2 = 9.$$


Let us look again at the equation of a circle, (2-2). This equation is 
equivalent to the equation 


$$x^2 + y^2 - 2x_0 x - 2y_0 y + (x_0^2 + y_0^2 - R^2) = 0,$$
that is, to an equation of the form
$$x^2 + y^2 - 2ax - 2by + c = 0. \tag{2-4}$$


Thus the equation of the circle with center (2, —1) and radius 3, given 
above, becomes 
$$x^2 - 4x + 4 + y^2 + 2y + 1 = 9$$
or
$$x^2 + y^2 - 4x + 2y - 4 = 0$$


when transformed to the form (2-4).


* The phrase “truth set” or “solution set” is often used instead of “locus”
in modern texts. The meaning is the same. We use “locus” here because it 
is in such common use that the student should be aware of it. Besides, it is shorter. 




Is any equation of the form (2-4) the equation of a circle? The answer 
is clearly no, since, for example, the equation 


$$x^2 + y^2 - 2x - 2y + 4 = 0$$


is equivalent to
$$(x - 1)^2 + (y - 1)^2 = -2,$$


and no points in the plane would have coordinates which could satisfy 
this equation. The sum of two squares cannot be negative. 

Given an equation of the form (2-4), it is easy enough to tell whether or 
not it is the equation of a circle and, if it is, to identify the circle. All that 
needs to be done is to complete the square in both x and y. For example, 
given the equation 


$$x^2 + y^2 - 6x + 14y + 33 = 0,$$
we would proceed by the following steps:

$$
\begin{aligned}
x^2 - 6x + y^2 + 14y &= -33, \\
x^2 - 6x + 9 + y^2 + 14y + 49 &= -33 + 9 + 49, \\
(x - 3)^2 + (y + 7)^2 &= 25, \\
(x - 3)^2 + (y + 7)^2 &= 5^2.
\end{aligned}
$$


This last equation can immediately be identified from (2-3) as the equa- 
tion of the circle of radius 5 with center at the point (3, —7). Note how 
the numbers appearing in this form are the negative of the coordinates of the 
center. 
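Completing the square in (2-4) always leads to $(x - a)^2 + (y - b)^2 = a^2 + b^2 - c$, so the test for a circle can be reduced to the sign of $a^2 + b^2 - c$. The Python sketch below illustrates this; the function name `circle_from_form_2_4` is hypothetical, and its arguments are the a, b, c of form (2-4), so that an equation such as $x^2 + y^2 - 6x + 14y + 33 = 0$ is entered with a = 3, b = -7, c = 33.

```python
import math


def circle_from_form_2_4(a, b, c):
    """For x^2 + y^2 - 2ax - 2by + c = 0, return (center, radius) or None.

    Completing the square gives (x - a)^2 + (y - b)^2 = a^2 + b^2 - c.
    """
    r_squared = a * a + b * b - c
    if r_squared <= 0:
        return None  # not a circle of positive radius
    return (a, b), math.sqrt(r_squared)


# x^2 + y^2 - 6x + 14y + 33 = 0: center (3, -7), radius 5.
print(circle_from_form_2_4(3, -7, 33))  # ((3, -7), 5.0)

# x^2 + y^2 - 2x - 2y + 4 = 0: the right-hand side would be -2.
print(circle_from_form_2_4(1, 1, 4))    # None
```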

Suppose we were asked to find the circle which passes through three 
given points, say the points (0, —4), (—5, 1), and (4, 4). We know that 
the equation of any circle can be brought into the form (2-4), and hence 
if we can find a, b, and c in this equation we can find the circle. To do this, 
all that needs to be done is to put the values of x and y for the given points 
into (2-4) and solve the resulting set of equations for a, b, and c. 

When the coordinates of the three points given above are put into 
(2-4), we have 

16 + 8b +c = 0, 


26 + 10a — 2b + c = 0, 
32 — 8a — 8b + c = 0, 


or equivalently, 
8b + c = — 16, 


10a — 2b + c = —26, (2-5) 
8a + 8b — c = 32. 


This set of equations can then be solved for a, b, and c. The resulting
values can be used to find the center and the radius of the circle in the
manner described above.


Similarly, by assuming an equation for the circle with unknown param- 
eters, we could find the equations for circles with other given conditions. 
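The procedure for three given points can be sketched in Python as follows. Substituting a point (x, y) into (2-4) gives the linear equation $2xa + 2yb - c = x^2 + y^2$ in the unknowns a, b, c. The sketch solves the three resulting equations by Cramer's rule, which is one possible method chosen here for illustration and not one prescribed at this point in the text; the names `det3` and `circle_through` are hypothetical.

```python
def det3(m):
    """Third-order determinant, expanded along the top row."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))


def circle_through(p1, p2, p3):
    """Return (a, b, c) of form (2-4) for the circle through three points."""
    points = (p1, p2, p3)
    rows = [[2 * x, 2 * y, -1] for x, y in points]  # coefficients of a, b, c
    rhs = [x * x + y * y for x, y in points]
    d = det3(rows)
    if d == 0:
        raise ValueError("the three points lie on one line")

    def with_column(col):
        # Replace one column of the coefficient array by the right-hand sides.
        return [[rhs[i] if j == col else rows[i][j] for j in range(3)]
                for i in range(3)]

    a, b, c = (det3(with_column(col)) / d for col in range(3))
    return a, b, c


# Three points of the circle of radius 1 about the origin: x^2 + y^2 - 1 = 0.
a, b, c = circle_through((1, 0), (0, 1), (-1, 0))
assert (a, b, c) == (0, 0, -1)
```

The center is then (a, b) and the radius is the square root of $a^2 + b^2 - c$, as found by completing the square.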


PROBLEMS 


1. Make a sketch locating the following points in the cartesian plane: 


(a) A = (1, 5) (b) B = (6, 1)
(c) C = (-6, 2) (d) D = (3, -5)
(e) E = (-2, -7) (f) F = (2, 7)
(g) G = (-2, 7) (h) H = (2, -7)


2. If a and b are two nonzero real numbers, what is the relationship between the 
points (a, b) and (—a, b)? between the points (a, b) and (a, —b)? between 
the points (a, b) and (—a, —b)? What happens to these relationships if 
a or b is zero? 


3. Using the points A through H of Problem 1, find the distances 


(a) |AB| (b) |AC|
(c) |EF| (d) |FG|
(e) |FH| (f) |DE|
4. Using the points in Problem 1, find the distances: 
(a) |AE| (b) |AF| 
(c) |CG| (d) |BC| 


5. Write the equations of the circles with the following centers and radii, both 
in the form (2-2) and in the form (2-4). Make a sketch showing the circle. 
(a) Center (1, 2); radius 4 
(b) Center (3, 4); radius 5 
(c) Center (—5, 3); radius 1 


6. Follow the instructions given for Problem 5 for the following circles: 
(a) Center (0, 2); radius 2 
(b) Center (6, —2); radius 6 
(c) Center (—2, —2); radius 8 


7. Identify whether or not the following are equations of circles. If they are, 
give the centers and the radii. 
(a) $x^2 + y^2 + 2x - 4y - 4 = 0$
(b) $x^2 + y^2 - 20y + 84 = 0$
(c) $x^2 + y^2 - 6x - 2y + 14 = 0$


8. Follow the same instructions as in Problem 7. 


(a) $x^2 + y^2 + 2x - 3y + 1 = 0$
(b) $x^2 + y^2 + 7x - 8y + 3 = 0$
(c) $3x^2 + 3y^2 + 4x + 18y + 7 = 0$


9. What is the locus of the equation
$$(x - x_0)^2 + (y - y_0)^2 = 0,$$
where $x_0$ and $y_0$ are specified real numbers?
10. Solve equations (2-5) and find the center and the radius of the circle. 


11. Find the equations of the circles satisfying the conditions below, and give 
the centers and the radii. 
(a) The circle passes through the points (0, 0), (8, 1), (7, 0). 
(b) The circle passes through the points (9, 1), (8, —4), (1, 138). 
(c) The circle passes through the points (0, 4), (0, —2), (4, 2). 


12. Find the equations of the circles of radius 10 which pass through the points 
(—4, 0) and (12,0). How many of them are there? Make a sketch. 


13. Find the equation of the circle with center at (3, —7) which passes through 
the point (6, 2). 


2-2 STRAIGHT LINES 


In Chapter 4 we will consider the problem of defining exactly what 
is meant by a straight line. In this section we will assume that we know 
what a straight line is, and concentrate on its properties. 

First, let us consider a very special case. Suppose L is a straight line 
in the cartesian plane which is parallel to the y-axis. Then by the very 
way in which we introduced the coordinates of a point, we see that every 
point on this line has the same x-coordinate, namely the coordinate of the 
point at which this line cuts the x-axis. Furthermore, every point which 
has this value for the x-coordinate is on the line. Thus, for each point 
(x, y) on the line, we must have 


$$x = c, \tag{2-6}$$


where c is the coordinate of the point on the x-axis at which the given line 
crosses. We therefore see that we have proved: 


Theorem 2-2. L is a straight line in the cartesian plane which is parallel 
to the y-axis if and only if there is some real number c such that 


$$L = \{(x, y) \mid x = c\}.$$


Note that this statement is the same (using our conventions) as saying 
that the equation of L is x = c. Students sometimes find it difficult to 
think of x = c as defining a set of points in the plane since y does not appear 
in this equation; but if we recall that this is just a short way of stating the 
set relation in this theorem, there should be no such difficulty.

Now if L is any line in the cartesian plane which is not parallel to the
y-axis, then it must cut every line parallel to the y-axis at a single point.
That is, given any real number $x_0$, there will be one and only one point
on the line with $x_0$ as its x-coordinate.

Let b be the coordinate of the point at which the line cuts the y-axis,
and draw the line parallel to the x-axis through the point (0, b). In a
similar manner as above, we see that this line could be identified as the
line whose equation is y = b. Next, if we choose any $c \neq 0$, and also
draw in the line x = c, we will have formed a right triangle. All such
triangles have a common angle at the point (0, b) and hence are similar.

In fact, if we take any two points $(x_1, y_1)$ and $(x_2, y_2)$ on the line which
are such that $x_1 < x_2$ and draw in the lines $y = y_1$ and $x = x_2$, we will
have formed a right triangle which is also similar to any of the above
triangles (see Fig. 2-3).


FIGURE 2-3


Let us fix a particular triangle as the one to refer all of the others to. 
We will use the triangle determined by the given line and the lines y = b 
and x = 1. The length of the base of this triangle is 1. Let the height 
of this triangle be |m|, where the sign of m is so chosen that the point 
(1,b + m) is on the line. Thus, m is positive if the line “rises,” as does 
the line in Fig. 2-3, and m is negative if the line “falls.” For this triangle, 
the ratio of the height to the base is |m|/1 = |m|.

On the other hand, for the triangle determined by the points $(x_1, y_1)$
and $(x_2, y_2)$, the base is of length $(x_2 - x_1)$ and the height is $|y_2 - y_1|$
(why?). The fact that this triangle is similar to the triangle fixed above
means that


$$\frac{|y_2 - y_1|}{x_2 - x_1} = |m|.$$


Note, however, that m has been chosen to be positive or negative so
that it has the same sign as $(y_2 - y_1)$ in this case when we are assuming
$x_1 < x_2$.

We can therefore conclude:


Theorem 2-3. If L is a straight line which is not parallel to the y-axis,
then there exists a real number m such that if $(x_1, y_1)$ and $(x_2, y_2)$ are
any two distinct points of the line, then

$$m = \frac{y_2 - y_1}{x_2 - x_1}.$$


As stated, this theorem does not require that $x_1 < x_2$. The student
is asked to verify that this restriction is not necessary in one of the problems 
at the end of this section. 
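Theorem 2-3 says that the quotient is the same whichever two points of the line are used. A small Python sketch, added here as an illustration, makes that concrete; the function name `slope` and the sample line y = 2x + 1 are assumptions for the example, and exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction


def slope(p1, p2):
    """Slope of the line through two distinct points, as in Theorem 2-3."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a line parallel to the y-axis has no slope")
    return Fraction(y2 - y1, x2 - x1)


# Three points of the line y = 2x + 1 (a hypothetical sample line).
P, Q, R = (0, 1), (1, 3), (4, 9)

assert slope(P, Q) == slope(Q, R) == slope(P, R) == 2
# Reversing the order of the points changes both differences in sign.
assert slope(Q, P) == slope(P, Q)
```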

Let us apply this theorem to the point (0, b) at which the line cuts the
y-axis and a general point (x, y), $x \neq 0$, on the line. The result of this
theorem gives

$$m = \frac{y - b}{x}.$$

This equation is easily seen to be equivalent to the equation
$$y = mx + b \tag{2-7}$$


except when x = 0. We see that the point (0, b) satisfies (2-7), however. 
Therefore, every point on the line satisfies (2-7). Hence we have 


Theorem 2-4. If L is a straight line in the cartesian plane, not parallel 
to the y-axis, then there are real numbers b and m such that 


$$L = \{(x, y) \mid y = mx + b\}.$$


Conversely, the locus of an equation of the form (2-7) is a straight line 
not parallel to the y-axis. 


Strictly speaking, the last half of this theorem has not been proved. 
However, exactly the same discussion as above can be used to show that
if a point (x,y) satisfies this equation, then (x, y) must lie on the line 
through the points (0, b) and (1, b + m). 


Definition 2-3. If the line L has equation y = mx + b, then b is called
the y-intercept of L and m is called the slope of L. 


Equation (2-7) is called the slope-intercept form of the equation of a 
line. A line which is parallel to the y-axis is often said to have infinite 
slope (see Problem 3), but it is more correct to say that it has no slope. 
Note that the lines for which m = 0, that is, those with equation y = b, 
are parallel to the x-axis.

Equations (2-7) and (2-6) can be combined into a single case. Either
one can be written in the form 


ax + by + c = 0, (2-8) 


where not both a and b are zero. Equation (2-8) is called the general 
form of the equation of a line. We see that if b = 0 in an equation of this 
form, we can divide through by a and get the equation of a line of the 
type (2-6). If $b \neq 0$, then we can divide by b and get the slope-intercept
form of an equation of a line. In this case, the slope is m = -a/b.

The student will find it essential to be able to find the equation of a line 
satisfying given conditions. There are two types of conditions which 
appear very frequently in practice, and which must therefore be familiar 
to the student. These conditions are: a pair of points which are to be on 
the line, and a given point and slope. Let us look at this second condition 
first. 

If the slope m is given, we know that the line will have an equation of
the form

$$y = mx + b,$$


and hence all that needs to be determined is the correct value for b. Sup-
pose that the point $(x_1, y_1)$ is given and is to be on the line. Then the
coordinates must satisfy the equation, giving


$$y_1 = mx_1 + b,$$
and hence $b = y_1 - mx_1$. The required equation is therefore
$$y = mx + (y_1 - mx_1). \tag{2-9}$$


This equation can be put into a slightly different form which is often useful.
Equation (2-9) is equivalent to


$$y - y_1 = m(x - x_1), \tag{2-10}$$


which is called the point-slope form of the equation of a line. Note that
each side of the equation is zero at the point $(x_1, y_1)$.

The point-slope form of the equation could also be derived directly
from Theorem 2-3. Indeed, Eq. (2-10) is equivalent to


$$m = \frac{y - y_1}{x - x_1}, \tag{2-11}$$
except for the point $(x_1, y_1)$ which is on the line and satisfies (2-10) but
not (2-11). Equation (2-11) can be derived immediately from Theorem 2-3.


This result is easily remembered, particularly in the form (2-11), and 
the student is advised to be sure he learns it, since it is used quite often.

Let us look at an example of the use of this result. What is the equation
of the line with slope 1/2 which passes through the point (-3, 1)? From
(2-10) we have

$$y - 1 = \tfrac{1}{2}(x + 3).$$


We can find an equivalent equation in the form (2-8) from the above
equation. Such an equation is


$$x - 2y + 5 = 0.$$
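The passage from the point-slope form (2-10) to the general form (2-8) is mechanical: move every term to one side and clear denominators. The Python sketch below illustrates this under the assumption that the slope and the coordinates are integers or exact fractions; the function name `point_slope_to_general` is hypothetical, and the choice of sign (leading nonzero coefficient positive) is a convention adopted only for the example.

```python
from fractions import Fraction
from math import gcd


def point_slope_to_general(m, point):
    """Integers (a, b, c) with a*x + b*y + c = 0 for slope m through point."""
    m = Fraction(m)
    x1, y1 = Fraction(point[0]), Fraction(point[1])
    # y - y1 = m(x - x1) rearranges to m*x - y + (y1 - m*x1) = 0.
    coeffs = [m, Fraction(-1), y1 - m * x1]
    # Multiply through by the least common denominator to clear fractions.
    scale = 1
    for v in coeffs:
        scale = scale * v.denominator // gcd(scale, v.denominator)
    a, b, c = (int(v * scale) for v in coeffs)
    # Fix the overall sign so that the leading nonzero coefficient is positive.
    if a < 0 or (a == 0 and b < 0):
        a, b, c = -a, -b, -c
    return a, b, c


# Slope 1/2 through (-3, 1): x - 2y + 5 = 0.
print(point_slope_to_general(Fraction(1, 2), (-3, 1)))  # (1, -2, 5)
```

Multiplying all three coefficients by a nonzero constant leaves the locus unchanged, which is why the sign and scale may be fixed freely.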


Next, let us consider the problem of finding the equation of the line 
passing through two given points. This can be done in several ways. We 
could assume the general form 


$$ax + by + c = 0$$


and use the coordinates of the two given points to obtain the pair of
equations
$$ax_1 + by_1 + c = 0, \qquad ax_2 + by_2 + c = 0.$$


These two equations contain three unknowns, but they can be solved 
nonetheless. The method is to eliminate one of the unknowns, leaving a 
single equation in two unknowns. This can be solved for one of the un- 
knowns in terms of an assumed value of the other. A solution set can be 
obtained for each such assumed value. But the different solution sets
for different assumed values are multiples of each other. This does not
matter, since the locus of Eq. (2-8) is unchanged when all of the coefficients 
are multiplied by the same nonzero constant. 

For example, let us find the equation of the line passing through the 
points (1,2) and (4,4). Assuming the equation ax + by + c = 0, we 
find that at (1, 2) 

$$a + 2b + c = 0,$$
and that at (4, 4)
$$4a + 4b + c = 0.$$


Subtracting the first of these equations from the second, we find
$$3a + 2b = 0.$$


We now assume any convenient value for a and solve for b. The value
a = 2 is useful here, since it makes b = -3 (an integer). These values
can now be put into one of the equations given above. In particular,
from the first equation,

$$c = -a - 2b = -2 + 6 = 4.$$

The desired equation is therefore
$$2x - 3y + 4 = 0.$$


The method described above is not the only one which can be used to 
find an equation of the line through two points. The results of Theorem 2-3 
can also be utilized. 

If the two given points have the same x-coordinate, but different y-
coordinates, then the line is parallel to the y-axis and must have the equa-
tion $x = x_1$ ($x_1$ being the common x-coordinate). Let us suppose that
for the two given points $x_1 \neq x_2$, and solve for the point-slope form of
the equation. All we need is the slope, since we know a point already.
From Theorem 2-3 we have the slope


$$m = \frac{y_2 - y_1}{x_2 - x_1}, \tag{2-12}$$
and putting this into the point-slope form of the equation as given by
(2-10), using $(x_1, y_1)$ as the point, gives


$$y - y_1 = \frac{y_2 - y_1}{x_2 - x_1}(x - x_1).$$


This result can be written in several equivalent forms, two of which are:


$$\frac{y - y_1}{y_2 - y_1} = \frac{x - x_1}{x_2 - x_1},$$
$$(x_2 - x_1)(y - y_1) = (y_2 - y_1)(x - x_1). \tag{2-13}$$


The student should check to see that the two given points actually 
satisfy these equations. 

Both are called the point-point forms of the equation of the line. The 
first is easier to remember, but the second is of greater generality, since it 
remains valid even when $x_1 = x_2$.

In practice, most students find it easier to determine the slope first, 
using (2-12), and then to use the point-slope form (2-10) rather than trying 
to memorize formulas (2-13). For example, to find the line through the 
points (1, 3) and (5, —5), we first find the slope 


$$m = \frac{-5 - 3}{5 - 1} = \frac{-8}{4} = -2,$$
and then using (2-10) we have the equation


$$y - 3 = -2(x - 1),$$
or, equivalently,
$$2x + y - 5 = 0.$$
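Expanding the second of the forms (2-13) gives the general form directly, with $a = y_2 - y_1$, $b = -(x_2 - x_1)$, and $c = (x_2 - x_1)y_1 - (y_2 - y_1)x_1$, and this remains valid for lines parallel to the y-axis. The Python sketch below is an added illustration assuming integer coordinates; the function name `line_through` and the normalization (common factors removed, leading nonzero coefficient positive) are choices made only for the example.

```python
from math import gcd


def line_through(p1, p2):
    """Integers (a, b, c) with a*x + b*y + c = 0 through two distinct points."""
    (x1, y1), (x2, y2) = p1, p2
    if (x1, y1) == (x2, y2):
        raise ValueError("two distinct points are required")
    # From (x2 - x1)(y - y1) = (y2 - y1)(x - x1), with all terms on one side.
    a = y2 - y1
    b = -(x2 - x1)
    c = (x2 - x1) * y1 - (y2 - y1) * x1
    # Remove any common factor; a and b are not both zero, so g is nonzero.
    g = gcd(gcd(abs(a), abs(b)), abs(c))
    a, b, c = a // g, b // g, c // g
    if a < 0 or (a == 0 and b < 0):
        a, b, c = -a, -b, -c
    return a, b, c


print(line_through((1, 2), (4, 4)))   # (2, -3, 4), that is, 2x - 3y + 4 = 0
print(line_through((1, 3), (5, -5)))  # (2, 1, -5), that is, 2x + y - 5 = 0
print(line_through((3, 1), (3, 5)))   # (1, 0, -3), that is, x = 3
```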
PROBLEMS


1. Explain why Theorem 2-3 holds even if $x_2 < x_1$. 


2. Let c be a fixed real number. Find the equation of the line with slope m which 
cuts the x-axis at the point x = c. Write this equation in the general form 
(2-8) with a = 1. What happens to this equation if m becomes increasingly 
large? 


3. Find the equation of the line which passes through the two points (a, 0) and 
(0, b), where neither a nor b is zero. These points are called the intercepts 
of the line. Make a sketch showing these points and how the line is 
determined. Show that this equation can be brought into the form 

$$\frac{x}{a} + \frac{y}{b} = 1.$$


4. Find the equation of the line with the given slope, passing through the given 
point. Make a sketch showing the line. Give the equation in the general form 
(2-8), and in the slope-intercept form (2-7). 

(a) Slope 2, point (7,3) (b) Slope —1, point (1, —1) 

(c) Slope 5, point (0, 10) (d) Slope 4, point (0, 10) 

(e) Slope —%, point (—4, 5) 


5. Follow the same directions as in Problem 4. 

(a) Slope —4, point (3, —8) (b) Slope 50, point (1, 0) 
(c) Slope —z'5, point (1, 0) (d) Slope —50, point (1, 0) 
(e) Slope 1, point (0, —1) 


6. Find the equation of the line passing through the given points. Give the 
equation in the general form (2-8), and give the slope of the line. Make a 
sketch. 

(e) (—3, 5), (7, 5) 

7. Follow the directions of Problem 6. 

(c) (4, —8), (8, —4) (d) (—2, —7) (6, —7) 

(e) (7, 32), (8, —62) 


8. Show that the equation of the line through the points $(x_1, y_1)$ and $(x_2, y_2)$ is 
given by 

$$\begin{vmatrix} x & y & 1 \\ x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \end{vmatrix} = 0.$$

Can you turn this into a formula which can be used to determine whether or 
not three points are all on the same line? 


2-3 FUNCTIONS AND GRAPHS 


We have mentioned functions a few times already, expecting that the stu- 
dent would have no trouble understanding what was meant, but in this 
section we will give the formal definition of a function and try to make 
clear how we think about and work with the function concept. 

Up until quite recently (within the last century), when functions were 
mentioned in the mathematical literature they were usually considered 
to be formulas, that is, algebraic expressions which could be written down 
more or less explicitly, involving one or more variables. By the turn of 
the century it had become obvious that this concept was too narrow and 
that a better understanding of a function was needed. 

In particular, it was realized that functions could not be restricted to 
having arguments and values among just the real and complex numbers. 
In fact, it seemed that there should be no such restrictions at all. The 
elements of any set had to be allowable. Eventually what was arrived 
at was the following formal definition. 


Definition 2-4. Let D and R be any two sets. A function with domain 
D and range R is a set F of ordered pairs (x, y) with the properties: 


(1) For every $(x, y) \in F$, $x \in D$ and $y \in R$, 

(2) For every $x \in D$ there is one and only one $y \in R$ such that 
$(x, y) \in F$. 

The set D is called the domain of the function and the set 


$$\{\, y \mid (x, y) \in F \text{ for some } x \in D \,\}$$


is called the image of the function. 
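
For finite sets, the two properties in Definition 2-4 can be checked mechanically. The following Python sketch is only an illustration; the names is_function and image, and the small sets used in the example, are our own and do not come from the text.

```python
def is_function(F, D, R):
    """Check properties (1) and (2) of Definition 2-4 for finite sets."""
    # (1) every pair (x, y) in F has x in D and y in R
    if any(x not in D or y not in R for (x, y) in F):
        return False
    # (2) every x in D is the first element of exactly one pair of F
    firsts = [x for (x, _) in F]
    return all(firsts.count(x) == 1 for x in D)


def image(F):
    """The set of all y that occur as second elements of pairs in F."""
    return {y for (_, y) in F}


D = {1, 2, 3}
R = {"a", "b", "c"}
F = {(1, "a"), (2, "a"), (3, "b")}

print(is_function(F, D, R))               # F satisfies both properties
print(image(F))                           # a proper subset of R
print(is_function(F | {(1, "b")}, D, R))  # 1 would have two values
```

Note that R here is larger than the image, which is permitted by the definition.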


This formal definition becomes necessary for certain difficult problems. 
But in general it is too clumsy for actual use in the ordinary situation. In 
fact, it is usually better to think of functions as some “rule” that asso- 
ciates to each element of the set D a unique element of the set R. 

Note that in the above definition, the range can be any set which con- 
tains the image. Strictly speaking we should not speak of the range, since 
this is not unique. The concept of range is useful when considering a 
function in which it is difficult or impossible to determine the exact image. 
In such a case, the range is defined to be the smallest set which we are 
sure contains the image. For example, we might have a function in which 
the domain was the set of all people in the world and which associated 
to each person the number of hairs on his head. We certainly cannot de- 
termine the image of this function, but we know that the range is a 
subset of the nonnegative integers. We could say that the range is the 
set of all real numbers, or the smaller set of all integers, but these sets 
are clearly too big. Can you put an upper limit on the integers in the 
range? 

A special notation is used for functions. We will usually denote the 
function by a letter and when we are talking about functions in general, 
we will usually use the letter f. The way in which we use the functional 
notation is explained by the following definition: 


Definition 2-5. If the function f is a set F of ordered pairs as defined 
above, then by the value of the function at $x \in D$ we mean the element 
$y \in R$ such that $(x, y) \in F$. This value will be denoted by f(x). 


We sometimes will speak of “the function f(x)” instead of “the function 
f.” This happens especially when the function can be defined by a simple 
formula, such as 


$$f(x) = x^2.$$


This is a function which associates to each real number x the value $x^2$, 
and is therefore the set of ordered pairs $\{(x, x^2) \mid x \text{ is a real number}\}$. We 
will say that this is the function $x^2$, even though $x^2$ is really the value of 
the function at x and not the function itself. 

The logical confusion in letting f(x) represent both the function and 
the value of the function never causes difficulty in any normal context. 
Essentially, it merely amounts to thinking of x as being a variable point 
in D and thinking of f(x) then as representing the set of all pairs (x, f(x)) 
in the above definition. 

Although most of the functions we shall consider will have real values 
and will be defined on some subset of the real numbers, it is worthwhile to 
give a few examples showing other types of functions. 

In a library, every book is assigned a call number. This gives rise to a 
function which assigns a call number to each book. We might write this 
function as C(x). The domain of this function is the set of all books in 
the library. The x in C(x) thus ranges over all of these books. The value 
of C(x) is a call number, and the image of C(x) is the set of all call numbers 
of the books. 

Let P be the set of all people in the United States and let D be the 
collection of all subsets of P. Then we can define a function on D by 
specifying the value of the function to be the number of people in each 
subset. This is a type of function which is of interest to the census bureau, 
although they only consider certain special subsets, such as the set of all 
men, the set of all residents of a given state, the set of all unemployed, 
etc. The image of this function would be the set of all integers from zero 
to the total population of the United States. 

Every rational number can be written in the form p/q, where p is an 
integer and q is a positive integer so that p and q have no factors in 
common (we say p/q is in lowest terms). For every rational number x, we 
can define a function f(x) to be f(x) = 1/q, where x = p/q in lowest 
terms. What is the image of this function? 
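
A few values of this function can be computed with exact rational arithmetic. The sketch below uses the Fraction type from Python's standard library, which stores every rational number in lowest terms with a positive denominator; the choice of Python and of the sample arguments is ours.

```python
from fractions import Fraction


def f(x):
    """Return 1/q, where x = p/q in lowest terms with q > 0."""
    x = Fraction(x)  # Fraction reduces to lowest terms automatically
    return Fraction(1, x.denominator)


for x in (Fraction(6, 8), Fraction(-5, 10), Fraction(7, 1), Fraction(0)):
    print(x, "->", f(x))
```

Here 6/8 is first reduced to 3/4, so the value of the function is 1/4 and not 1/8.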

For a final example of a nonstandard type of function, suppose that on 
the domain of all real numbers we define 


$$f(x) = \begin{cases} 0 & \text{if } x \text{ is rational}, \\ 1 & \text{if } x \text{ is irrational}. \end{cases}$$


What is the image of this function? Is the function completely defined? 

Next, we wish to define the graph of a function. 


Definition 2-6. The graph of a function f(x) with domain D is the set 
of all ordered pairs (x, f(x)) where $x \in D$. 


A comparison of Definition 2-4 with this definition seems to show that 
the function and its graph are the same thing. Each is the same set of 
ordered pairs. This is true, but we actually distinguish between the two 
concepts. 

In the first place, we do not normally think of the function as the set of 
ordered pairs. We usually think of it as the association (or mapping) 
between the elements of the domain and the elements of the range. 

However, even when we use the formal definitions, we still distinguish 
between the function and the graph. The function is considered to be the 
set of ordered pairs F, while the graph is considered to be the same set F, 
thought of as a subset of the set of all ordered pairs (x, y) with x in the 
domain and y in the range of F. 

This becomes easier to see when we have a function whose domain is 
the set of real numbers (or a subset of the set of real numbers) and whose 
range is the set of real numbers. The ordered pairs of the graph are 
ordered pairs of real numbers, and hence can be identified with points 
of the cartesian plane. The graph will then be a point set in the cartesian 
plane. For each value of x, we can mark the point (x, f(x)) on the car- 
tesian plane and obtain a picture of the graph of the function. 

Not every point set in the cartesian plane is the graph of a function. 
The requirements that a point set be the graph of a function can be deduced 
from Definition 2-4 and will serve to clarify the meaning we assign to the 
term function. For each real number c which is in the domain of the 
function f(x), there is a unique real number f(c) such that (c, f(c)) is 
on the graph. This means that for each c in the domain, the line x = c 
cuts the graph at one and only one point. 

Thus, for example, Fig. 2-4 shows the locus of the equation 

$$y^2 = x,$$

but it is not the graph of a function whose domain is a subset of the 
x-axis. There is an obvious function (or rather two such functions) 
associated with this locus, however. The function defined by 

$$y = \sqrt{x}$$

has as its domain the set of all nonnegative real numbers and as its image 
the set of all nonnegative real numbers. What is its graph? 


Note that when we think of the graph of a function in the cartesian 
plane, we usually write the function in the form 


y = f(x), 


since we are thinking of a point set of ordered pairs (x, y) in our standard 
notation. 

As another example, we show in Fig. 2-5 the graph of the function 

$$y = |x|.$$

What is the domain of this function? What is the image of this function? 


FIGURE 2-4          FIGURE 2-5 


There are two special types of functions which occur frequently enough to 
deserve special comment. These are the polynomial functions and the rational 
functions. 

A polynomial function is one whose value at each x is given by a linear 
combination of powers of x. That is, a polynomial function, p(x), is 
defined by some finite number, n + 1, of real numbers $a_0, a_1, \ldots, a_n$ 
such that for every x, 

$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n.$$
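
Since a polynomial function is completely determined by its list of coefficients, it is easy to represent on a computer. The Python sketch below is our own illustration; the name poly is hypothetical, and the nested evaluation it uses (often called Horner's rule) is a rearrangement of the sum above that is not discussed in the text.

```python
def poly(coeffs):
    """Return the function p with p(x) = a0 + a1*x + ... + an*x**n.

    coeffs is the list [a0, a1, ..., an].
    """
    def p(x):
        value = 0
        # Work from an down to a0: (...((an)*x + a(n-1))*x + ...)*x + a0
        for a in reversed(coeffs):
            value = value * x + a
        return value
    return p


p = poly([1, -2, 3])        # p(x) = 1 - 2x + 3x^2
print(p(0), p(1), p(2))
```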


We will assume that the student is already well acquainted with the basic 
properties of polynomials, and turn to rational functions. 

A rational function is a function whose value at each x is given by the 
quotient of two polynomials. The domain of a rational function is 
therefore the set of all real numbers, less those real numbers for which the 
denominator is zero. We will assume that the student is familiar with the 
graphs of polynomial functions, but we will make a few remarks about 
the problems which arise in sketching the graphs of rational functions. 
This may best be done by showing how a particular example can be 
analyzed. 

Let us sketch the graph of the function 

$$f(x) = \frac{x^2 - 1}{x^2 - x - 6} = \frac{(x - 1)(x + 1)}{(x - 3)(x + 2)}.$$


The first thing we do is (as shown above) factor the numerator and the 
denominator. It is then clear that the domain of this function is the set 
of all real numbers less the numbers 3 and —2. The zeros of the numerator 
tell us that the function is zero at 1 and —1 (and only at these values). 

The function cannot change sign between any two consecutive members of 
the four values of x found above (-2, -1, 1, and 3). 
This fact involves questions of continuity which we cannot discuss here, 
but the student may accept it without question. To determine the sign 
of the function f(x) between these values, we merely need to calculate 
values of the function at intermediate points. These values will help us 
make the sketch later. In this example, we find 
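
Values of this kind are easily computed by machine. The following Python sketch evaluates the function at one point in each of the five intervals determined by -2, -1, 1, and 3; the sample points are our own choice, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def f(x):
    """f(x) = (x - 1)(x + 1) / ((x - 3)(x + 2)), computed exactly."""
    x = Fraction(x)
    return (x - 1) * (x + 1) / ((x - 3) * (x + 2))


# One sample point in each interval between the zeros of the
# numerator and of the denominator (our own choice of points).
samples = [-3, Fraction(-3, 2), 0, 2, 4]

for x in samples:
    value = f(x)
    print(x, value, "positive" if value > 0 else "negative")
```

The signs alternate from one interval to the next, which is all we need to know for the sketch.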


Next, we determine the behavior of the function near the zeros of the 
denominator. Suppose that x is near but smaller than -2 in this example. 
Then (x - 1) is near -3, (x + 1) is near -1, and (x - 3) is near -5. 
Therefore f(x) must be near 

$$\frac{(-3)(-1)}{-5(x + 2)} = \frac{-3}{5(x + 2)}.$$


FIGURE 2-6 


Since x is smaller than -2, (x + 2) is negative. What happens if x gets 
very close to -2? The factor (x + 2) stays negative, but gets close to 
zero, and hence |f(x)| must become very large. That is, “f(x) gets close 
to $\infty$.” A common way of writing this down is 

$$f(x) \to +\infty \quad \text{as } x \to -2 \ (x < -2).$$

This is a purely symbolic statement which we interpret as saying that the 
values of f(x) for x very close to, but less than, -2 are “near $+\infty$”; 
that is, they are very large in absolute value and are positive. For this 
example we see that 

$$\begin{aligned}
f(x) &\to +\infty && \text{as } x \to -2 \ (x < -2), \\
f(x) &\to -\infty && \text{as } x \to -2 \ (x > -2), \\
f(x) &\to -\infty && \text{as } x \to 3 \ (x < 3), \\
f(x) &\to +\infty && \text{as } x \to 3 \ (x > 3).
\end{aligned}$$


Note that in order to write the above expressions we do not need to give 
all of the analysis which has been carried out. All we need to do is to note 
what the sign of f(x) is near the point in question; and we determine this 
sign by calculating intermediate values. 

Finally, we would like to determine the behavior of f(x) as x becomes 
very large in absolute value. We do this by dividing the numerator and 
denominator through by the highest power of x in f(x). In this case 

$$f(x) = \frac{x^2 - 1}{x^2 - x - 6} = \frac{1 - 1/x^2}{1 - 1/x - 6/x^2}.$$

As |x| becomes increasingly large, 1/x, $1/x^2$, and $6/x^2$ become very 
close to zero, and hence f(x) gets close to 1. 
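
This conclusion can be checked numerically. In the Python sketch below (our own illustration, using ordinary floating-point arithmetic), f is the function as originally written and f_divided is the form obtained after dividing through by the square of x; both names are ours.

```python
def f(x):
    """The rational function of the example, as originally written."""
    return (x * x - 1) / (x * x - x - 6)


def f_divided(x):
    """The same function after dividing numerator and denominator by x**2."""
    return (1 - 1 / x**2) / (1 - 1 / x - 6 / x**2)


for x in (10.0, 1000.0, -1000.0, 1e6):
    print(x, f(x), f_divided(x))
```

The two forms agree (up to rounding) wherever both are defined, and the printed values settle toward 1 as |x| grows, from above for large positive x and from below for large negative x.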

With all of the above information available, we can now make a sketch 
of the graph of the function. First, we draw dashed lines x = -2, x = 3, 
and y = 1. These lines are called the asymptotes of the function, since 
the graph gets very close to these lines at distances far from the origin. 
Next, we plot the points that have been determined and sketch in a 
smooth graph which makes use of all of the information on the behavior 
of the function that we have been able to determine. For this example, 
we obtain the graph shown in Fig. 2-6. 


PROBLEMS 


1. Under what conditions is a straight line in the cartesian plane the graph of a 
function? If it is the graph of a function, what formula will give the values 
of the function? 


2. The set of points on the circle whose equation is 

$$x^2 + y^2 = R^2$$

for which $y \geq 0$ is the graph of a function. What is a formula for f(x)? What 
is the domain of this function? What is the image of this function? 


3. What is the domain of the function defined by f(x) = 1/x on the real 
numbers? What is its image? Sketch the graph of this function. 


4. If f(x) is a function whose domain is some subset of the real numbers and 
whose range is in the set of real numbers, when is 1/f(x) a function and what 
is its domain? 

5. Sketch the graph of each of the following functions, whose domains are the 
real numbers or subsets of the real numbers: 

(a) $f(x) = 1 + x$                (b) $f(x) = x + |x|$ 

(c) $f(x) = [16 - x^2]^{1/2}$     (d) $f(x) = 4 - [16 - x^2]^{1/2}$ 

(e) $f(x) = \begin{cases} 0 & \text{if } x \text{ is rational} \\ 1 & \text{if } x \text{ is irrational} \end{cases}$ 


6. Sketch the graph of each of the following functions: 








(a) f(z) = 2° — 424+ 3 (b) f(z) = (£ — 1)(2” — 4) 
(c) f(@) = @+ 1)@? +1) d = FF 
(æ —1) o (-—1’ 
© JO = G— OIO S r 
_ («+ 3)@ — 1) _ ea + 1)(@ + 2) 
(8) J@) = Cae 42) b) $2) = Ge + De —1) 
() fa) = 2 — D a =- EID 
ed eu eT | ) I@) = Te —DEetrD 


2-4 TRANSLATIONS 


The euclidean concept of translation is connected with the idea of a 
“rigid motion,” that is, the moving of the points of the plane so that the 
distance between any two points is the same after the motion as it was 
before. The euclidean rigid motions are translation, rotation, and reflec- 
tion. The particular feature that distinguishes a translation from one of 
the other rigid motions is that every point is moved in the same direction 
and through the same distance. 


Definition 2-7. A euclidean rigid motion of the cartesian plane is a 
function m(P) whose domain and image are the set of all points in the 
cartesian plane and which preserves the distance between points; i.e., 
for any two points A and B, if A’ = m(A) and B’ = m(B), then 


|A’B'| = |AB|. 


A translation is a rigid motion in which the distance between any point 
and its image is always the same. 





FIGURE 2-7 


Suppose that a translation carries the origin, O, to the point O’ with 
coordinates (h, k). We do not prove it here, but it follows from the above 
definition that the points on a line are mapped by the translation to the 
points of a line parallel to the original one. Then if the point P is mapped 
to the point P’, we see as in Fig. 2-7 that the broken lines through O’ 
are the images of the axes and that the dashed lines through P’ are the 
images of the dashed lines through P. All of these lines are parallel to one 
or the other of the axes, and we conclude that P’ has been moved a hori- 
zontal distance h and a vertical distance k from P. That is: 


Theorem 2-5. If the points of the cartesian plane are translated so that 
the point which is at the origin goes to the point with coordinates (h, k), 
then a point with coordinates (x, y) is translated to the point $(x', y')$, 
where 

$$x' = x + h, \qquad y' = y + k.^{*} \tag{2-14}$$

* This theorem could just as well be made the definition of a translation. 


Note that under a translation, every point of the plane is moved, but 
that the coordinate axes remain fixed. We “slide” the plane along under 
the axes. Formula (2-14) gives us the relationship between the coordinates 
of the point before and after the translation. It is sometimes useful to 
write this in the following manner: 

$$\begin{aligned}
(x, y) &\to (x', y'), \\
(0, 0) &\to (h, k), \\
(a, b) &\to (a + h, b + k), \\
(1, -1) &\to (1 + h, -1 + k).
\end{aligned}$$
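
Formula (2-14) translates directly into a small program. The following Python sketch is only an illustration; the names translation, move, and distance are our own. It builds the map of Theorem 2-5 for given h and k and checks, for one pair of sample points, that the distance between them is unchanged, as Definition 2-7 requires.

```python
import math


def translation(h, k):
    """Return the map (x, y) -> (x + h, y + k) of formula (2-14)."""
    def move(point):
        x, y = point
        return (x + h, y + k)
    return move


def distance(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


T = translation(3, -2)
A, B = (1, 2), (4, 6)

print(T((0, 0)), T((1, -1)))                  # images of the origin and of (1, -1)
print(distance(A, B), distance(T(A), T(B)))   # the two distances agree
```

A check on one pair of points is of course not a proof, but it shows how the definition is used.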


Suppose now that we are given a locus in the plane. For a concrete 
example, consider a circle with radius R and center O, that is, the locus of 

$$x^2 + y^2 = R^2. \tag{2-15}$$

We know that a translation which carries the point at the origin to the 
point with coordinates (h, k) will carry this circle to a circle of the same 
radius, but with center at the point (h, k). How could this be determined 
directly from Eq. (2-15)? If a point (x, y) is on the original circle, x and y 
satisfy (2-15). The point $(x', y') = (x + h, y + k)$ is on the translated 
circle. From (2-14), $x = x' - h$ and $y = y' - k$, and hence the translated 
point $(x', y')$ must satisfy the equation 

$$(x' - h)^2 + (y' - k)^2 = R^2, \tag{2-16}$$


which we recognize as the equation of the translated circle. Looking at 
this closely, we see that in general: 


Theorem 2-6. Let C be the locus of an equation 

$$f(x, y) = 0,$$

and let C' be the set obtained from C by a translation which carries 
the point at the origin to the point with coordinates (h, k); then C' is 
the locus of the equation 

$$f(x' - h, y' - k) = 0.$$


In the second expression of this theorem we have used x’ and y’ as the 
variables to emphasize the relations (2-14) and the fact that we are as- 
suming the points of C’ to have coordinates (x', y’). In practice, after 
making this transformation, we would drop the primes, leaving the equa- 
tion of the translated locus in usual form. 
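
Theorem 2-6 can also be expressed as an operation on functions of two variables. In the Python sketch below (our own illustration; the names translate_locus, circle, and moved are hypothetical) a locus is represented by a function f, a point being on the locus when f(x, y) = 0. The circle of radius 5 and the translation with h = 2 and k = -1 are sample values chosen by us.

```python
def translate_locus(f, h, k):
    """Given f, whose locus is f(x, y) = 0, return g with g(x, y) = f(x - h, y - k).

    By Theorem 2-6, g(x, y) = 0 is an equation of the locus translated so
    that the point at the origin moves to (h, k).
    """
    return lambda x, y: f(x - h, y - k)


R = 5
circle = lambda x, y: x**2 + y**2 - R**2    # circle of radius 5 about the origin
moved = translate_locus(circle, 2, -1)      # the same circle about (2, -1)

print(circle(3, 4))   # (3, 4) is on the original circle
print(moved(5, 3))    # its image (3 + 2, 4 - 1) is on the translated circle
print(moved(3, 4))    # (3, 4) itself is not on the translated circle
```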

Note that the identification obtained above can also be read from: 

$$\begin{aligned}
(x, y) &\to (x', y'), \\
(x, y) &\to (x + h, y + k), \\
(x' - h, y' - k) &\to (x', y').
\end{aligned}$$

In the last line, the coordinates of the point on the right are the same as 
the coordinates on the right of the top line. Therefore, the corresponding 
coordinates on the left must be the same also. When (x, y) satisfies a given 
equation, we therefore must have the pair of numbers $(x' - h, y' - k)$ 
satisfying the same equation. 

The concept of translation can be used in several different ways. First, 
if we are given the equation of a locus, we can ask for the equation of the 
translated locus. This was the problem solved by the above theorem. 
Suppose, to give a specific example, we have the locus 


$$K = \{(x, y) \mid y = 8x^2\},$$

and we wish to make the translation which sends the point at the origin 
to the point with coordinates (1, -3). Then 

$$\begin{aligned}
(0, 0) &\to (1, -3), \\
(x, y) &\to (x', y'), \\
(x, y) &\to (x + 1, y - 3), \\
(x' - 1, y' + 3) &\to (x', y'),
\end{aligned}$$

so that the locus K translates to 

$$K' = \{(x', y') \mid y' + 3 = 8(x' - 1)^2\} = \{(x, y) \mid y + 3 = 8(x - 1)^2\}. \tag{2-17}$$

Sometimes we would like to find the equation of a locus whose equation 
we would know if the locus were properly positioned. For example, what 
is the equation of the locus consisting of the point of intersection of the 
two lines y = x + 1 and y = -x + 3, together with all points on the 
two lines “above” this point? That is, the locus 

$$A = \{(x, y) \mid y = x + 1 \text{ or } y = -x + 3, \text{ and } y \geq 2\}.$$

In the last section, we saw that if this locus is translated so that the point 
of intersection is at the origin, then the translated locus would have the 
equation $y' = |x'|$; so we write 

$$\begin{aligned}
(x, y) &\to (x', y'), \\
(1, 2) &\to (0, 0), \\
(x, y) &\to (x - 1, y - 2),
\end{aligned}$$

and obtain the desired equation 

$$A = \{(x, y) \mid y - 2 = |x - 1|\}.$$

Finally, we can use translations to simplify equations. Having the locus 

$$E = \{(x, y) \mid 4(x - 1)^2 + 3(y + 5)^2 = 12\},$$

for example, we can introduce the translation 

$$(x, y) \to (x', y'), \qquad (x, y) \to (x - 1, y + 5),$$

so that $x' = x - 1$ and $y' = y + 5$. Then, under this translation, 

$$E' = \{(x', y') \mid 4x'^2 + 3y'^2 = 12\}.$$


We still may not know what this new locus is, but at least it has a simpler 
equation. 
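
The relation between a locus and its simplified form can be tested on particular points. The Python sketch below is our own illustration for the locus given by 4(x - 1)^2 + 3(y + 5)^2 = 12 and the translation x' = x - 1, y' = y + 5; the function names and the sample points are hypothetical choices, made with integer coordinates so that the comparisons are exact.

```python
def in_E(x, y):
    """Membership in the original locus E."""
    return 4 * (x - 1) ** 2 + 3 * (y + 5) ** 2 == 12


def in_E_prime(x, y):
    """Membership in the simplified locus E'."""
    return 4 * x ** 2 + 3 * y ** 2 == 12


def to_primed(x, y):
    """The translation (x, y) -> (x - 1, y + 5)."""
    return (x - 1, y + 5)


for point in [(1, -3), (1, -7), (2, -5)]:
    print(point, in_E(*point), to_primed(*point), in_E_prime(*to_primed(*point)))
```

For every point tested, membership in E agrees with membership of the translated point in E', as the discussion above leads us to expect.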


PROBLEMS 


1. Why are the two sets in (2-17) the same? 


2. Suppose that we have translations T and T' such that T takes 
$(x, y) \to (x + h, y + k)$ and T' takes $(x, y) \to (x + h', y + k')$. 

(a) Is the result of translating the plane by translation T and then translating 
the resulting points by translation T' itself a translation? What happens 
to the coordinates of a point under these circumstances? 

(b) Are translations commutative? That is, is the result of T followed by T' 
the same as T' followed by T? (Be careful here.) 

(c) Given T, does there always exist a T' which is the inverse of T; that is, 
such that T followed by T' returns all points to where they started from? 
If so, what is it? 


3. Sketch the graphs of 


(a) $y = |x - 3|$                (b) $y = |x + 4| - 2$ 
(c) $y + 2 = \sqrt{x - 1}$       (d) $y = |4x - 8| - 3$ 


4. Find the equation of the locus resulting from the translation of 
D = {(x, y) | |x| + |y| = 


in such a way that the point which is at the origin moves to the point whose 
coordinates are (2, —4). 


2-5 ANGLES 


Euclid defined an angle to be the inclination of one line to another. Since 
inclination is undefined, this is not a definition at all. A study of Euclid’s 
proofs shows, however, that he considered an angle to be the geometric 
configuration of two intersecting lines, with two angles being equal if the 
two geometric configurations are congruent (i.e., if they can be made to 
coincide by means of rigid motions). 

While this form of the definition of an angle was sufficient for Euclid’s 
work, it has been found necessary to use a more complex definition in 
modern mathematics. Given two intersecting lines L and L’, we must be 
able to distinguish between the angle from L to L’ and the angle from L’ 
to L, and we must be able to assign real number values to angles. In 
this section we will try to make these ideas precise. 


Definition 2-8. Let P and Q be two distinct points of the plane and let 
L be the line through P and Q. Make L a coordinate axis by letting P 
be the origin and letting Q have a positive coordinate. Then the ray 
from P through Q is the set of all points on L which have nonnegative 
coordinates. We call this the ray PQ. 


Note that it does not matter what unit of length is used in defining the 
coordinates on the line. Only the direction chosen for the positive co- 
ordinates matters. For a given line through a point P there are two rays, 
one for each way of choosing a sense of direction on L. From a given point 
there are infinitely many possible rays, two for every line through P. 


Definition 2-9. A geometric angle is a pair of rays originating from the 
same point. An oriented geometric angle is an ordered pair of rays 
originating from the same point. If PR and PQ, in that order, are the 
pair of rays making up an oriented geometric angle, then the ray PR 
is called the initial side and the ray PQ is called the terminal side of the 
oriented geometric angle RPQ. 


In this definition the two rays may coincide or may lie in opposite 
directions along the same line. This conflicts with the definition used in 
many geometry courses, but we will find it useful not to have to dis- 
tinguish these special cases. Note that a given geometric angle determines 
two oriented geometric angles, depending upon which of the two rays we 
call the initial side of the angle. 

Before we continue, we must make some observations about orientation 
in the plane. The two rays consisting of the points of the x- and y-axes 
with nonnegative coordinates divide the plane into two regions, one of 
which is “three times as large” as the other. We can get from the x-axis 
to the y-axis by moving along the unit circle (the circle with radius one 
and center at the origin) in two distinct ways. One of these ways is shorter 
than the other, the shorter path being in the counterclockwise direction. 
As far as the mathematics goes, it is unimportant whether the shorter 
path from the x-axis to the y-axis is counterclockwise or clockwise, but it 
is important that we are able to fix one such sense of rotation and refer 
to it as needed. Observe that the assignment of an orientation in this way 
depends on our being “on one side” of the plane. If we have an angle formed 
by two rays in space, we could find a plane containing these rays, but we 
cannot decide whether to call a rotation from one ray to the other clock- 
wise or counterclockwise unless we know which side of the plane we 
observe it from. 

Let us assume that we have fixed an orientation in the cartesian plane 
so that we can speak of a “clockwise” or “counterclockwise” rotation in 
this plane; the orientation is to be so chosen that the counterclockwise 
route is the shorter one in going from the positive x-axis to the positive 
y-axis. 

The concept of angle which we wish to develop is to be independent of 
translation and rotation (but not of reflection), and hence it will suffice to 
consider only angles at the origin which are such that the initial side is 
the positive portion of the x-axis (Fig. 2-8). Note that in Fig. 2-8 we 
show an arrow on an arc between the two rays to indicate which is the 
terminal side of the angle. 

Since we work with the real numbers, we would like to assign real num- 
ber values to angles. This has, of course, been done since the earliest times: 
The Babylonians measured angles by dividing a full circle into 360 equal 
parts and we still use their system when we measure angles in degrees. 
In military usage, angles are measured in mils, which are defined to be 
1/6400 of a full circle. This particular system has a computational simplicity 
which is useful in the particular applications made of it. 

It would appear that we are at liberty to assign almost any desired 
unit to a system of angular measurement. However, for mathematical 
purposes, one particular method of measuring angles turns out to be the 
most valuable. This is the so-called radian measure of angles, which 
assigns the value $2\pi$ to the full circle. 

It is convenient to have angles with all possible real numbers as their 
values, but it is clear that any pair of rays could have an angular 
measurement only between 0 and $2\pi$ if we assign the value $2\pi$ to the full circle. 
We avoid this and other difficulties by defining the configuration of rays 
which corresponds to a given numerical value of an angle rather than by 
defining the numerical value of an angle defined by a pair of rays. We 
will do this by means of another undefined concept, that of arc length along 
the circumference of a circle. 

When we say that we want angles to be independent of euclidean motion, 
this implies an ability to subdivide angles (dividing the whole circle into 
360 degrees, for example). We can imagine a method of subdividing the 
circle, using the euclidean notion of congruence, so as to assign a rational 
multiple of the entire length of the circle to any arc on the circle. Just 
as the rational numbers can be completed to the reals, we could then 
complete these arc length measurements so as to be able to measure any 
arc. Conversely, we will also assume that given any real number we can 
measure off a circular arc of that length. To do so, we must make an 
agreement about what to do with negative numbers. 

To fix our thoughts, let us use the unit circle; that is, the circle centered 
at the origin whose radius is one. Let the point P be the point (1, 0) on 
this circle (Fig. 2-9). The entire length of circumference of this circle is 
$2\pi$ (by the definition of $\pi$). Just as we could measure off coordinates on a 
line, we now assume we can measure off coordinates on the circle, starting 
at P and proceeding in the counterclockwise direction for positive co- 
ordinates. Thus, for example, the point A in Fig. 2-9 would correspond 
to the coordinate +1 while the point B would correspond (as shown) to 
the coordinate —2. 


FIGURE 2-8 FIGURE 2-9 


The point (0, 1) corresponds to the coordinate $\pi/2$ but also to the 
coordinate $-3\pi/2$ (why?). Each real number would give us only one point, 
but each point will have many coordinates. The point P in particular has 
the coordinate zero, but since the total length of the circumference of the 
circle is $2\pi$, it also has the coordinates $2\pi$, $-2\pi$, $4\pi$, $-4\pi$, etc. Indeed, 
it is easily seen that if any point has a coordinate a, then it has coordinates 
$a + 2\pi k$ for $k = 0, \pm 1, \pm 2, \ldots$ The coordinates of a given point all 
differ by integral multiples of $2\pi$. 
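
The fact that many coordinates belong to one point can be observed numerically. The Python sketch below is only an illustration and rests on an assumption taken from trigonometry rather than from this section: that the point of the unit circle at arc length a from (1, 0), measured counterclockwise, is (cos a, sin a). The names point_at and same_point are our own, and points are compared up to a small tolerance because the computer works with rounded values of $\pi$.

```python
import math


def point_at(a):
    """Point of the unit circle with arc-length coordinate a, measured from (1, 0).

    Assumes the trigonometric description (cos a, sin a) of that point.
    """
    return (math.cos(a), math.sin(a))


def same_point(p, q, tol=1e-9):
    """Compare two points up to a small tolerance for rounding error."""
    return (math.isclose(p[0], q[0], abs_tol=tol)
            and math.isclose(p[1], q[1], abs_tol=tol))


# pi/2 and -3*pi/2 are coordinates of the same point (0, 1).
print(same_point(point_at(math.pi / 2), point_at(-3 * math.pi / 2)))

# The coordinates 1 + 2*pi*k all belong to the point with coordinate 1.
print(all(same_point(point_at(1 + 2 * math.pi * k), point_at(1))
          for k in range(-3, 4)))
```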


Definition 2-10. Let a be any real number. Let P be the point (1, 0) 
on the unit circle, and A the point with coordinate a, measured as arc 
length from P on the unit circle, positive coordinates being measured 
counterclockwise from P. Then the ray from the origin through A 
is said to make an angle with value a with the ray from the origin 
through P. 




By virtue of this definition, the same ray OA will make an angle with 
many different measures with the ray OP. If we are given a real number a, 
then the ray OA will be uniquely determined, but the same ray can also 
be said to make an angle with measure $a + 2\pi k$ with the ray OP, where 
k can be any integer. 

Once we have a clear understanding of this ambiguity, we may use 
looser terminology without creating confusion. We often see a phrase 
such as “an angle of $\pi/2$,” and we need to decide exactly what is meant 
by a phrase such as this. 

First of all, we will assume that the above definition has been extended 
so that we can speak of the ray QR making an angle whose measure is a 
with the ray QS, where Q is an arbitrary point in the plane and QS is an 
arbitrary ray from that point. This can be done with the help of transla- 
tion and rotation, two of the euclidean motions which we will assume 
known. 


Definition 2-11. Let $\alpha$ be any real number. Then by an angle of $\alpha$, we
mean any oriented geometric angle which is such that the terminal ray
makes an angle with value $\alpha$ with the initial ray in the sense of the
definition above. The number $\alpha$ will then be called a value or measure
of the angle.


The phrase "an angle of $\alpha$" introduced in this definition is merely a
short way of saying "an oriented geometric angle in which the terminal
ray makes an angle with measure $\alpha$ with the initial ray." The student will
find that the word angle is in common use to mean either the geometric
angle or the numerical value attached to that angle. Usually, this dual
usage will give no difficulty, and the student can determine the particular
meaning desired by the context.


Figure 2-10


For pictorial purposes it is convenient to show the measurement of the
angle along a spiral rather than the unit circle, as in Fig. 2-10. This figure
illustrates the angle with measure $9\pi/2$. The spiral allows us to count the
amount of rotation needed to obtain this angle.

A given geometric angle determines many angles in the sense of this
definition, but each real number determines only one geometric angle.
The different numerical values for a given geometric angle differ by
integral multiples of $2\pi$. Of all these, exactly one, $\alpha$, will lie in the interval

$$0 \le \alpha < 2\pi.$$


This is most easily seen by thinking of the pair of rays cutting the unit
circle, and noting that we can get from the first to the second by moving
counterclockwise through some arc of less than $2\pi$ in length. If the two
rays coincide, the value 0 for the angle can be used. The numerical value
for the angle in this interval can be used as a standard value.

Sometimes, however, it is more convenient to use the interval

$$-\pi < \alpha \le \pi$$

for the standard value. The student should satisfy himself that angles
can be reduced to this range as well as to the range $0 \le \alpha < 2\pi$.
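
The reduction to a standard value can also be carried out mechanically.
The following Python sketch is offered only as an illustration; the names
`standard_value` and `symmetric_value` are our own and are not part of the
text. It assumes ordinary floating-point arithmetic, so the results are
correct only up to rounding.

```python
import math

TWO_PI = 2.0 * math.pi

def standard_value(alpha: float) -> float:
    """Value in [0, 2*pi) differing from alpha by a multiple of 2*pi.

    Python's % operator takes the sign of the divisor, so the result is
    nonnegative (up to floating-point rounding at the endpoints).
    """
    return alpha % TWO_PI

def symmetric_value(alpha: float) -> float:
    """Value in (-pi, pi] differing from alpha by a multiple of 2*pi."""
    beta = standard_value(alpha)
    # Values past pi are moved back by one full turn.
    return beta - TWO_PI if beta > math.pi else beta
```

For instance, `standard_value(9 * math.pi / 2)` is approximately $\pi/2$, the
standard value of the angle shown in Fig. 2-10.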

Angles can be added geometrically or numerically with the same results.
Thus, for example, if the angle from the ray OP to the ray OA has the
value $\alpha$ and the angle from the ray OA to OB has the value $\beta$, then the
angle from OP to OB has the value $\alpha + \beta$. Note that when we use the
terminology of the definition and say that the angle from OP to OA has
the value $\alpha$, what we really mean is that $\alpha$ is one of the infinitely many
possible values that can be given as the value of this angle. The number $\alpha$
determines a unique geometric angle, but not conversely. So long as this
is remembered, no confusion need arise.

In a similar way, we see how angles can be subtracted. In particular,
we can take the negative of an angle. Thus, in Fig. 2-11, $\beta$ is the value of
the angle from OA to OB while the angle from OB to OA would have the
value $-\beta$.


Figure 2-11


We have defined angles in great generality, as signed angles in the
plane. In actual practice, there are times when the full amount of
generality is not needed. For example, while we have defined only the angle
from one ray to another, it is sometimes unnecessary to distinguish
between the rays. Then we need only speak of the angle between the two rays.
The value of such an angle is usually taken to be positive and in the
interval $0 \le \alpha \le \pi$. This would be the absolute value of $\beta$, where $\beta$ is the
standard value for the angle, chosen so that $-\pi < \beta \le \pi$. When we
consider the angle between two rays in space, we cannot use signed values,
since a sense of direction of rotation cannot be given as it can in the plane.

There are many possible units of measurement of length. We have
assumed a natural unit of measurement in the cartesian plane, and in
terms of this unit of length, a circle whose radius is one will have a
circumference of length $2\pi$. We could, however, introduce other units of
arc length and use them to measure angles. For example, if we assign the
length of 360 to the circumference of the unit circle, then in using
Definition 2-10 we would obtain other numerical values for the same angles.
In this particular case we would obtain a measurement of the angles in
degrees. We indicate this by means of a symbol which shows the unit of
measurement we use. For an angle measured in degrees, we use the
symbol °.

This point needs to be stressed. Whenever a particular unit of measure- 
ment is used in describing an angle, that unit must be given or implied. 
Thus, we can describe an angle as having the value 45°, but the degree 
sign is essential and cannot be omitted. After all, we recognize the state- 
ment, “This board is 10 long” as nonsense. On the other hand, we do speak 
of a line segment of length 2 in the cartesian plane. Two what? Two 
units of length—the unit of length that is given as part of the cartesian 
plane. In this case we understand the unit. Similarly we do not indicate 
the unit when we measure an angle in radians. This makes it even more 
important to give the unit when we measure the angle in any other way. 


PROBLEMS 


1. Make sketches showing angles of

(a) $\pi/2$      (b) $\pi$        (c) $3\pi/2$     (d) $2\pi$
(e) $-\pi/2$     (f) $-\pi$       (g) $-3\pi/2$    (h) $-2\pi$
(i) $\pi/4$      (j) $-7\pi/4$    (k) $\pi/3$      (l) $5\pi/6$
(m) $15\pi/4$    (n) $-9\pi/2$    (o) $127\pi$     (p) $253\pi/4$


2. For each of the angles in Problem 1, give a value in the range $0 \le \alpha < 2\pi$
for the same oriented geometric angle.


3. The angle of $2\pi$ (radians) is the same as the angle of 360°.

(a) If $\alpha$ is the value of an angle given in radians, what is the formula for
the value of the angle in degrees? That is, if an angle has values $\alpha$ and
$a^\circ$, then $a = {?}$

(b) Given an angle with a value of $a$ degrees, what is its measurement in
terms of radians?


4. Convert each of the angles in Problem 1 to degrees.


5. How many degrees is an angle of 1 radian?


6. Using the approximation $\pi = 3.1416$ and the fact that an angle of 6400 mils
is the same as an angle of $2\pi$, find the angle of one mil in radians to three
significant figures. What do you think "mil" stands for?


7. Suppose that the angle from the ray OP to the ray OA has the value $\alpha$, the
angle from OA to OB has the value $\beta$, and the angle from OP to OB has the
value $\gamma$. Why is it not necessarily true that $\alpha + \beta = \gamma$? How does this
situation differ from that of the statements made about Fig. 2-11?


2-6 THE TRIGONOMETRIC FUNCTIONS


The student is probably familiar with the standard trigonometric func- 
tions of angles. In this section we will give a definition of these functions 
which may appear different from that which the student has seen before. 
The functions are the same, however; we only change the definitions so 
as to make the particular properties which we wish to emphasize as easy 
to see as possible. 


Definition 2-12. Let $\alpha$ be any real number, let P be the point (1, 0) on
the unit circle, and let $A = (x_\alpha, y_\alpha)$ be the point on the unit circle
such that the angle from OP to OA has the value $\alpha$. Then the sine and
cosine functions of $\alpha$ are defined by

$$\cos\alpha = x_\alpha, \qquad \sin\alpha = y_\alpha.$$


The functions sine and cosine as defined here are functions whose domain
is the set of all real numbers. Since the points on the unit circle never
have coordinates outside of the range $-1$ to $+1$, the range of these functions
is the interval from $-1$ to $+1$ (inclusive of the endpoints). It is easy to
picture the behavior of these functions. Starting at $\alpha = 0$, as $\alpha$ increases
we can follow the x-coordinate, for example, and observe the behavior
of $\cos\alpha$. In this way, we see that the functions $\sin\alpha$ and $\cos\alpha$ have
graphs as shown in Fig. 2-12.

In Fig. 2-12, the horizontal axis is the $\alpha$-axis. A feature of these graphs
which should be noted is the fact that they repeat with a period of $2\pi$.
The functions $\sin\alpha$ and $\cos\alpha$ are examples of periodic functions since they
satisfy the conditions that for any $\alpha$,

$$\cos(2\pi + \alpha) = \cos\alpha, \qquad \sin(2\pi + \alpha) = \sin\alpha. \tag{2-18}$$


Figure 2-12


This follows from the fact that increasing the angle by $2\pi$ does not change
the geometric configuration of the rays, and it is the oriented geometric
angle that determines the values of the functions.


Definition 2-13. Let $f(x)$ be a function defined for all real $x$. Then $f(x)$
is said to be a periodic function with period $p$ if for every $x$

$$f(x + p) = f(x).$$


In terms of this formal definition, we see that Eqs. (2-18) state that the
sine and cosine functions are periodic with period $2\pi$.

If the ray from the origin through A makes the angle $\alpha$ with the positive
x-axis, then the continuation of this ray through to the other side of the
origin makes the angle $\pi + \alpha$ with the positive x-axis. If A is the point
$(x_0, y_0)$, this opposite ray cuts the circle at the point $(-x_0, -y_0)$ (make
a sketch and verify this, noting the similar triangles formed). This shows
that for any $\alpha$,

$$\cos(\pi + \alpha) = -\cos\alpha, \qquad \sin(\pi + \alpha) = -\sin\alpha. \tag{2-19}$$


Next, imagine the plane reflected in the x-axis. Any point $(x, y)$ will
be reflected to the point $(x, -y)$ (why?), and the points on the unit circle
will be reflected to points also on the unit circle (why?). If A is a
point such that the arc from P to A is of signed length $\alpha$, then A will be
reflected to a point A′ such that the arc from P to A′ is of signed length
$-\alpha$. But $A = (\cos\alpha, \sin\alpha)$ and $A' = (\cos\alpha, -\sin\alpha)$, so we have that
for any $\alpha$,

$$\cos(-\alpha) = \cos\alpha, \qquad \sin(-\alpha) = -\sin\alpha. \tag{2-20}$$
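
The three symmetry relations (2-18), (2-19), and (2-20) can be spot-checked
numerically. The sketch below is only an illustration and proves nothing;
the helper names `point_on_unit_circle`, `close`, and `check_symmetries` are
hypothetical, and the tolerance is an arbitrary choice that allows for
floating-point rounding.

```python
import math

def point_on_unit_circle(alpha):
    """Point A = (cos alpha, sin alpha) at arc length alpha from P = (1, 0)."""
    return (math.cos(alpha), math.sin(alpha))

def close(p, q, tol=1e-12):
    """True when two points agree in both coordinates up to tol."""
    return abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol

def check_symmetries(alpha):
    """Return three booleans, one each for (2-18), (2-19), and (2-20)."""
    x, y = point_on_unit_circle(alpha)
    return (
        close(point_on_unit_circle(alpha + 2 * math.pi), (x, y)),  # full turn
        close(point_on_unit_circle(alpha + math.pi), (-x, -y)),    # opposite ray
        close(point_on_unit_circle(-alpha), (x, -y)),              # reflection in x-axis
    )
```

Calling `check_symmetries` with any sample value of $\alpha$ should return three
true values.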


From Eqs. (2-20) we see that if we know the values of $\cos\alpha$ and $\sin\alpha$
for every positive $\alpha$, then we would know the values for every $\alpha$. However,
use of Eqs. (2-18), repeated as many times as necessary, shows that it
suffices to know the values for $\alpha$ between 0 and $2\pi$. With the help of (2-19)
we see that we need only the values between 0 and $\pi$. We can reduce this
still further by the following reasoning.

Observe Fig. 2-13. We show the two rays OA and OA′ where OA makes
the angle $\alpha$ with the positive x-axis, and OA′ makes the same angle $\alpha$
with the positive y-axis. Thus OA′ makes the angle $\pi/2 + \alpha$ with the
positive x-axis. The y-coordinate of A is $\sin\alpha$ and is indicated by the
vertical arrow in Fig. 2-13. It is clear that the displacement of A′ in the
horizontal direction, as indicated by the horizontal arrow, is of the same
numerical value but of opposite sign (since this displacement starts in
the negative direction). Hence we see that $\cos(\pi/2 + \alpha) = -\sin\alpha$.

In exactly the same way it can be seen that $\sin(\pi/2 + \alpha) = \cos\alpha$.
That is, we have the relations

$$\cos\left(\frac{\pi}{2} + \alpha\right) = -\sin\alpha, \qquad \sin\left(\frac{\pi}{2} + \alpha\right) = \cos\alpha. \tag{2-21}$$


Although our diagram indicates this only for $\alpha$ between 0 and $\pi/2$, it can
be seen that the argument is valid for any value of $\alpha$ by observing what
happens in Fig. 2-13 as $\alpha$ is allowed to change to any value. By means
of these equations we can reduce the problem of finding the sine or cosine
of any $\alpha$ to the same problem for $\alpha$ between 0 and $\pi/2$. Indeed, it suffices
to know the values between 0 and $\pi/4$, since we can combine (2-20)
and (2-21) to see that

$$\cos\left(\frac{\pi}{2} - \alpha\right) = \cos\left[\frac{\pi}{2} + (-\alpha)\right] = -\sin(-\alpha) = \sin\alpha$$

and

$$\sin\left(\frac{\pi}{2} - \alpha\right) = \sin\left[\frac{\pi}{2} + (-\alpha)\right] = \cos(-\alpha) = \cos\alpha.$$


Figure 2-13


If $\alpha$ is between $\pi/4$ and $\pi/2$, then $\pi/2 - \alpha$ is between 0 and $\pi/4$. Hence
the relations

$$\cos\left(\frac{\pi}{2} - \alpha\right) = \sin\alpha, \qquad \sin\left(\frac{\pi}{2} - \alpha\right) = \cos\alpha, \tag{2-22}$$

serve to complete the proof of the observation that the values of $\sin\alpha$
and $\cos\alpha$ for $\alpha$ between 0 and $\pi/4$ serve to define these functions for all
possible $\alpha$. This is the reason that trigonometric tables are normally given
only for this range (except that the last relations are usually built into
the tables).

The use of these relations in practice is fairly simple. For example, if
we wish to find $\sin(-197\pi/3)$ we first observe that this is $-\sin(197\pi/3)$
from (2-20). Now $197\pi/3 = 65\pi + 2\pi/3$, and $2\pi/3 = \pi/2 + \pi/6$, so

$$\begin{aligned}
\sin(-197\pi/3) &= -\sin(197\pi/3) && \text{by (2-20)} \\
&= -\sin(64\pi + \pi + 2\pi/3) \\
&= -\sin(\pi + 2\pi/3) && \text{by (2-18)} \\
&= \sin(2\pi/3) && \text{by (2-19)} \\
&= \sin(\pi/2 + \pi/6) \\
&= \cos(\pi/6) && \text{by (2-21).}
\end{aligned}$$
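
The same chain of reductions can be written as a short procedure. The
sketch below is one possible arrangement, not a prescribed method: the names
`reduce_sine` and `evaluate` are our own, and the angle is assumed to be
given as an exact rational multiple $q$ of $\pi$ so that the comparisons
involve no rounding.

```python
from fractions import Fraction
import math

def reduce_sine(q: Fraction):
    """Reduce sin(q*pi) to sign * f(r*pi), with 0 <= r <= 1/4 and f in {sin, cos}."""
    sign, func = 1, "sin"
    if q < 0:                      # (2-20): sin(-a) = -sin a
        q, sign = -q, -sign
    q = q % 2                      # (2-18): discard multiples of 2*pi
    if q >= 1:                     # (2-19): sin(pi + a) = -sin a
        q, sign = q - 1, -sign
    if q >= Fraction(1, 2):        # (2-21): sin(pi/2 + a) = cos a
        q, func = q - Fraction(1, 2), "cos"
    if q > Fraction(1, 4):         # (2-22): exchange sine and cosine at pi/2 - a
        q = Fraction(1, 2) - q
        func = "cos" if func == "sin" else "sin"
    return sign, func, q

def evaluate(sign, func, r):
    """Numerical value of sign * f(r*pi), used only to check a reduction."""
    return sign * getattr(math, func)(float(r) * math.pi)
```

With $q = -197/3$ the procedure returns the sign $+1$, the function cosine,
and $r = 1/6$, in agreement with the computation above.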


The student might find it difficult to memorize all of these relations. 
Luckily, there is a simple pair of formulas from which the above set of 
formulas can all be obtained. These “addition formulas” will be derived 
in the next section. However, some of these relations are so important 
that they should be learned in their own right. The equations (2-18) must 
be known without question. The equations (2-20) are also so useful that 
they should be known. It helps in learning these relations to recall the 
geometric picture. 

The formulas in (2-22) are of frequent utility, and since they are fairly 
easy to learn, it is recommended that the student learn these also. 

There is still another relationship between these functions which is
immediately available from the definition and which is of fundamental
importance. For any $\alpha$, the values of $\cos\alpha$ and $\sin\alpha$ are the coordinates
of a point on the unit circle. This point is at a distance 1 from the origin,
and hence

$$\sin^2\alpha + \cos^2\alpha = 1.$$


This relation is one of the fundamental properties of the trigonometric
functions and is important enough to be restated as a theorem.

Theorem 2-7. For any real number $\alpha$, the functions $\sin\alpha$ and $\cos\alpha$
satisfy the relation

$$\sin^2\alpha + \cos^2\alpha = 1. \tag{2-23}$$


Note that the sine and cosine functions have been defined as functions 
on the real numbers, but they are also, in a natural way, defined as func- 
tions on oriented geometric angles. Relation (2-18) is what is important 
in this regard. The convention we introduced for writing the measure of 
an angle in radians as a pure real number (without units) permits us to 
think of the trigonometric functions as functions of either angles or real 
numbers. 

We also allow the use of notation such as sin 45°. Here we write the
function as a function of the angle, indicating this by the use of a degree
symbol to show the units. With such a convention, note that we can write

$$\sin 60^\circ = \sin\frac{\pi}{3}$$

and other similar relations.

An important question which comes up frequently is the extent to which
an angle is determined by the trigonometric functions. The answers to
this question can be summarized as follows:


Theorem 2-8. If two numbers $a$ and $b$ are given such that $a^2 + b^2 = 1$,
then there is a unique $\alpha$ in the interval $0 \le \alpha < 2\pi$ such that
$a = \cos\alpha$ and $b = \sin\alpha$.


Theorem 2-9. If a number $a$ with $|a| \le 1$ and a sign, $+1$ or $-1$, are
given, then there is a unique $\alpha$ in the interval $0 \le \alpha < 2\pi$ such that
$a = \cos\alpha$ and $\sin\alpha$ has the given sign (or is zero). Likewise there is a
unique $\alpha'$ with $0 \le \alpha' < 2\pi$ such that $a = \sin\alpha'$ and $\cos\alpha'$ has the
given sign (or is zero).


Theorem 2-10. If a number $a$ with $|a| \le 1$ is given, then there is a
unique $\alpha$ in the interval $0 \le \alpha \le \pi$ such that $\cos\alpha = a$.


The first of these results is evident when we note that the given con- 
dition on a and b is exactly what is required to have the point (a, b) be 
a point on the unit circle. 
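
Theorem 2-8 asserts only that the angle exists and is unique, but the angle
can also be computed. The following sketch is an assumption-laden
illustration: the name `angle_from_point` is hypothetical, and it relies on
the library function `math.atan2`, which returns a value between $-\pi$ and
$\pi$ that we then shift into the standard range.

```python
import math

def angle_from_point(a: float, b: float) -> float:
    """Angle alpha in [0, 2*pi) with cos(alpha) = a and sin(alpha) = b.

    The pair (a, b) must lie on the unit circle, as in Theorem 2-8.
    """
    if not math.isclose(a * a + b * b, 1.0, abs_tol=1e-9):
        raise ValueError("(a, b) is not a point on the unit circle")
    alpha = math.atan2(b, a)          # value between -pi and pi
    # Negative values are moved forward by one full turn.
    return alpha + 2.0 * math.pi if alpha < 0 else alpha
```

For the point $(0, -1)$ this gives approximately $3\pi/2$ rather than $-\pi/2$.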

The second result follows from the observation that the line $x = a$ cuts
the unit circle at exactly two points (unless $a = +1$ or $-1$, in which case
there is only one point). The y-coordinates of these two points have
opposite signs, and only one of them has the given sign and thus determines $\alpha$.
The second half of this result follows in the same way by considering the
line $y = a$.

The third result follows from the second when we note that $\sin\alpha > 0$
for $0 < \alpha < \pi$ and $\sin\alpha < 0$ for $\pi < \alpha < 2\pi$. Hence, although the
line $x = a$ cuts the unit circle at two points in general, only one of these
points will correspond to an angle in the required range. This particular
result can also be interpreted as saying that the cosine alone serves to
determine the geometric angle (but not, of course, the oriented geometric
angle). Note that the converse of this result also holds. A geometric
angle has a single cosine. This follows from (2-20), since the two choices
of an oriented geometric angle would have measures which are negatives
of each other.


Figure 2-14

The remaining trigonometric functions can be defined in terms of the 
sine and cosine. They are useful in practice and should be known. At this 
time it is sufficient to know the definitions of these functions. 

These remaining functions are the tangent, cotangent, secant, and co- 
secant functions. They are defined as follows: 


Definition 2-14. For any $\alpha$ for which the denominator of the given
expression is not zero,

$$\tan\alpha = \frac{\sin\alpha}{\cos\alpha}, \qquad \cot\alpha = \frac{\cos\alpha}{\sin\alpha},$$

$$\sec\alpha = \frac{1}{\cos\alpha}, \qquad \csc\alpha = \frac{1}{\sin\alpha}.$$


The graphs of these functions are sketched in Fig. 2-14.

All of the relations (2-18) through (2-22) clearly hold when the cosine
is replaced by the secant and the sine is replaced by the cosecant (why?).
For the tangent and cotangent, however, somewhat different relations
hold. Putting the relations (2-20) into the definition, we see that

$$\tan(-\alpha) = -\tan\alpha, \qquad \cot(-\alpha) = -\cot\alpha. \tag{2-24}$$

From relations (2-19) we have

$$\tan(\pi + \alpha) = \tan\alpha, \qquad \cot(\pi + \alpha) = \cot\alpha, \tag{2-25}$$

which shows that the functions $\tan\alpha$ and $\cot\alpha$ are periodic with period
$\pi$ (half of that of the remaining trigonometric functions).

Finally from (2-21) and (2-22) we deduce that

$$\tan\left(\frac{\pi}{2} + \alpha\right) = -\cot\alpha, \qquad \tan\left(\frac{\pi}{2} - \alpha\right) = \cot\alpha, \tag{2-26}$$

and

$$\cot\left(\frac{\pi}{2} + \alpha\right) = -\tan\alpha, \qquad \cot\left(\frac{\pi}{2} - \alpha\right) = \tan\alpha. \tag{2-27}$$


A few other relations will be given as problems.


PROBLEMS
1. For what values of $\alpha$ is
(a) $\sin\alpha = 0$?     (b) $\cos\alpha = 0$?
(c) $\sin\alpha = 1$?     (d) $\cos\alpha = 1$?
(e) $\sin\alpha = -1$?    (f) $\cos\alpha = -1$?


2. What conclusion can be made from Eqs. (2-22) when $\alpha = \pi/4$? Show that
(2-23) then can be used to find the values of $\sin \pi/4$ and $\cos \pi/4$. What is
$\tan \pi/4$?


3. Reduce each of the following to a trigonometric function of an angle between
0 and $\pi/4$.
(a) $\sin(7\pi/5)$       (b) $\cos(-17\pi/2)$
(c) $\sin(315\pi/3)$     (d) $\sin(-111\pi/10)$
(e) $\cos(218\pi)$       (f) $\sin(-15\pi/2)$
(g) $\tan(35\pi/3)$      (h) $\cot(-12\pi/5)$


4. Rewrite relations (2-18) through (2-22) in terms of angles expressed in
degrees.


5. Reduce each of the following to a trigonometric function of an angle
(expressed in degrees) between 0° and 45°.

(a) sin (337°)       (b) cos (−1000°)
(c) sin (−2345°)     (d) cos (112°)
(e) tan (535°)       (f) cot (1800°)
(g) sec (215°)       (h) csc (−7000°)


6. Prove from (2-23) that

$$1 + \tan^2\alpha = \sec^2\alpha \qquad \text{and} \qquad 1 + \cot^2\alpha = \csc^2\alpha$$

for any $\alpha$ for which the functions are defined.


7. Let OA be the ray which makes an angle of $\alpha$ with the positive x-axis. Prove
that the line through O and A has slope $\tan\alpha$. What is the meaning of the
relation (2-25) in this context?


8. Let OA be the ray which makes an angle of $\alpha$ with the positive x-axis. Prove
that the line through O and A intersects the line $x = 1$ at the point
$(1, \tan\alpha)$. Make a sketch showing this for angles in all four quadrants: 0 to
$\pi/2$, $\pi/2$ to $\pi$, $\pi$ to $3\pi/2$, and $3\pi/2$ to $2\pi$.


9. Let Q be a point on the ray from the origin which makes an angle $\alpha$ with
the positive x-axis. Suppose that Q is a distance $c$ from the origin. Prove
that Q has coordinates

$$(c\cos\alpha,\ c\sin\alpha).$$


10. Show that if $f(x)$ is periodic with period $p$, then it is also periodic with
period $kp$, where $k$ is any nonzero integer.


2-7 TRIANGLE FORMULAS


A right triangle has two sides which meet at a right angle. The other two
angles formed by the sides of the triangle are taken as unoriented angles
to which we can assign values between 0 and $\pi/2$. Suppose that the
triangle is placed on the cartesian plane so that one side is on the x-axis
and a vertex (not at the right angle) is at the origin, as in Fig. 2-15. Let $\alpha$
be the value of the angle between the hypotenuse and the side along the
x-axis. Let $a$ be the length of the side of the triangle opposite this angle,
let $c$ be the length of the hypotenuse, and let $b$ be the length of the
remaining side.


Figure 2-15


Extend the hypotenuse if necessary, and locate the point Q which is
the intersection of the hypotenuse with the unit circle. Drop a
perpendicular from Q to the x-axis. This forms another right triangle which
is similar to the given triangle. But this new triangle has sides of length
$\sin\alpha$ (the vertical side), $\cos\alpha$ (the horizontal side), and 1 (the hypotenuse).
From the similarity, we can conclude that

$$\cos\alpha = \frac{b}{c}, \qquad \sin\alpha = \frac{a}{c}, \qquad \tan\alpha = \frac{a}{b}. \tag{2-28}$$

The first two of these can be written in the form

$$a = c\sin\alpha, \qquad b = c\cos\alpha, \tag{2-29}$$

which allows us to compute the length of the sides of the right triangle if
we know the length, $c$, of the hypotenuse and the value of one of the base
angles, $\alpha$.
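
As a small numerical illustration of (2-29), the sketch below computes the
two legs from the hypotenuse and a base angle. The function name
`right_triangle_legs` is our own, and the angle is assumed to be given in
radians and to lie strictly between 0 and $\pi/2$.

```python
import math

def right_triangle_legs(c: float, alpha: float):
    """Legs (a, b) of a right triangle from hypotenuse c and base angle alpha.

    a is the side opposite alpha and b the side adjacent to it, as in (2-29).
    """
    if c <= 0 or not (0 < alpha < math.pi / 2):
        raise ValueError("need c > 0 and 0 < alpha < pi/2")
    a = c * math.sin(alpha)   # a = c sin(alpha)
    b = c * math.cos(alpha)   # b = c cos(alpha)
    return a, b
```

The legs returned always satisfy $a^2 + b^2 = c^2$ up to rounding, which is
relation (2-23) multiplied through by $c^2$.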

If a line segment AB is given in the plane along with a line L, then
we may draw lines AA′ and BB′ through the respective points A and B,
perpendicular to the line L and meeting L at A′ and B′ respectively. The
line segment A′B′ is called the projection of AB on L. What is its length?
By adding the dashed line in Fig. 2-16, it is easy to see with the help of
(2-29) that

$$|A'B'| = |AB|\cos\alpha, \tag{2-30}$$

where $\alpha$ is the angle between the line L and the line through A and B.
When two lines intersect, four rays are determined. These determine
$\binom{4}{2} = 6$ geometric angles. Two of these are straight angles, and the
remaining four are equal (in value) in pairs. The two values will be $\alpha$
and $\pi - \alpha$. Use the smaller in (2-30). The other would give the negative
of the correct result, since

$$\cos(\pi - \alpha) = -\cos\alpha$$

from the formulas of the last section. The concept of projection will be
discussed more fully in the next chapter.


Figure 2-16

Suppose now we have an arbitrary triangle, not necessarily a right
triangle. Suppose that it has interior angles whose values are $\alpha$, $\beta$, and $\gamma$ (all
taken as positive angles in the interval from 0 to $\pi$), and that the sides
opposite the angles $\alpha$, $\beta$, and $\gamma$ are of lengths $a$, $b$, and $c$ respectively. We
know that the area of a triangle is given by $\tfrac{1}{2}$ the product of the length of
one of the sides with the altitude perpendicular to that side. Thus, if we
drop a perpendicular from the vertex with angle $\beta$ to the opposite side, as
illustrated in Fig. 2-17, and if the length of this altitude is $h$, then the area
of the triangle is

$$A = \tfrac{1}{2}bh.$$

However, the altitude drawn is a side of a right triangle. Hence

$$h = c\sin\alpha,$$

and thus

$$A = \tfrac{1}{2}bc\sin\alpha = \frac{abc}{2}\cdot\frac{\sin\alpha}{a}.$$


Figure 2-17


In exactly the same way, using the same altitude, we have

$$h = a\sin\gamma$$

and

$$A = \tfrac{1}{2}ba\sin\gamma = \frac{abc}{2}\cdot\frac{\sin\gamma}{c}.$$

By dropping an altitude from one of the other vertices we can show in
the same way that

$$A = \frac{abc}{2}\cdot\frac{\sin\beta}{b}.$$

If one of the angles is greater than $\pi/2$, as in Fig. 2-18, it is necessary
to note that

$$\sin(\pi - \beta) = \sin\beta \tag{2-31}$$

in proving that

$$h = c\sin\beta$$

by the method described.


Figure 2-18


Since this relation follows from the results of the last section, we can
conclude that these three representations of the area are valid. They are
equal, and hence dividing through by $abc/2$, we have

$$\frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c}. \tag{2-32}$$


The relations (2-32) are valid for any triangle. This result is known as
the Law of Sines, and can be used for several purposes. In particular, when
two angles of a triangle are known, the third angle is also known since the
sum of the angles in any triangle is $\pi$, and then if any side is known, the
relations (2-32) can be used to find the two remaining sides.
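
The procedure just described can be sketched in a few lines. This is an
illustration under stated assumptions rather than a complete triangle
solver: the name `solve_two_angles_one_side` is hypothetical, the angles are
in radians, and the known side is taken to be the side $a$ opposite $\alpha$.

```python
import math

def solve_two_angles_one_side(alpha: float, beta: float, a: float):
    """Return (gamma, b, c) for a triangle with angles alpha, beta and side a.

    The third angle comes from the angle sum pi; the remaining sides come
    from the law of sines (2-32).
    """
    gamma = math.pi - alpha - beta
    if alpha <= 0 or beta <= 0 or gamma <= 0 or a <= 0:
        raise ValueError("the given values do not describe a triangle")
    ratio = a / math.sin(alpha)        # common value of side / sine of opposite angle
    return gamma, ratio * math.sin(beta), ratio * math.sin(gamma)
```

For $\alpha = \pi/6$, $\beta = \pi/3$, and $a = 1$ the result is approximately
$\gamma = \pi/2$, $b = \sqrt{3}$, and $c = 2$.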

For example, if we know that sina = 4, sin 8 = #, and a = 3, then 
we can compute b from the law of sines by 


a : 3 {2 
b= st sing = 2 (2) 
_ 18 
— $ 


If the law of sines is used to determine an angle of a triangle, ambiguity
results. Since $\sin(\pi - \alpha) = \sin\alpha$, finding the sine of an angle does not
completely determine the angle. There are always two possibilities (except
when $\sin\alpha = 1$), one less than $\pi/2$ and one greater than $\pi/2$. Thus, for
example, if $a = 2$, $c = 3$, and $\sin\alpha = \tfrac{1}{3}$, then

$$\sin\gamma = c\,\frac{\sin\alpha}{a} = \frac{3}{2}\cdot\frac{1}{3} = \frac{1}{2}.$$

Hence, as we will prove in one of the problems at the end of this section,
$\gamma = \pi/6$, or $\gamma = 5\pi/6$.


Figure 2-19


This example is illustrated in Fig. 2-19. Both the angle labeled $\alpha$ and
the angle labeled $\alpha'$ have a sine whose value is $\tfrac{1}{3}$. However, the dashed line
cannot be the side of a triangle meeting the required conditions, since no
point of this ray is at a distance 2 from the point B. The points C and C′
are both at a distance 2 from B. The two triangles which satisfy the given
conditions are therefore OBC and OBC′.
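
The two candidate angles can be listed mechanically. The sketch below
covers only the step that uses the law of sines; the name `candidate_angles`
is our own, the inputs are assumed positive, and deciding which candidates
actually give a triangle still requires the geometric check made with
Fig. 2-19.

```python
import math

def candidate_angles(a: float, c: float, sin_alpha: float):
    """Angles gamma in (0, pi) allowed by sin(gamma)/c = sin(alpha)/a."""
    s = c * sin_alpha / a
    if s > 1.0:
        return []                        # no angle has a sine larger than 1
    acute = math.asin(s)
    if math.isclose(s, 1.0):
        return [acute]                   # the single possibility gamma = pi/2
    return [acute, math.pi - acute]      # sin(pi - gamma) = sin(gamma)
```

With $a = 2$, $c = 3$, and $\sin\alpha = 1/3$ the function returns two values,
approximately $\pi/6$ and $5\pi/6$.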


Figure 2-20


Turning to another topic, suppose that the sides of a triangle are of
lengths $a$, $b$, and $c$ and that $\alpha$ is the angle opposite the side of length $a$.
Position this triangle in the cartesian plane so that the vertex opposite
the side of length $a$ is at the origin and the angle $\alpha$ is measured from the
x-axis to the side of length $c$ (as shown in Fig. 2-20). The coordinates of
the vertices are then as shown, the third vertex being $(c\cos\alpha, c\sin\alpha)$,
and hence the distance formula gives us

$$\begin{aligned}
a^2 &= (c\cos\alpha - b)^2 + c^2\sin^2\alpha \\
&= c^2\cos^2\alpha - 2bc\cos\alpha + b^2 + c^2\sin^2\alpha \\
&= b^2 + c^2(\sin^2\alpha + \cos^2\alpha) - 2bc\cos\alpha,
\end{aligned}$$

or, since $\sin^2\alpha + \cos^2\alpha = 1$,

$$a^2 = b^2 + c^2 - 2bc\cos\alpha. \tag{2-33}$$

This result is called the Law of Cosines. Like the law of sines it can be
used to determine missing parts of a triangle. When used to determine
an angle (when all three sides are known) there is no ambiguity, since the
angle $\alpha$ must lie between 0 and $\pi$ and there is only one such angle for a
given value of the cosine.

As it is written, this relation can be used to determine the third side of a 
triangle when two sides and the included angle are given. If two sides and 
an angle other than the included angle are given, this relation results in a 
quadratic equation in the length of the third side. There are in general 
two possible solutions. 

Note that there are two other versions of the law of cosines in addition 
to (2-33). There is one for each angle in the triangle. These relations are 
easy to remember, since the equation says that the square of a side is 
equal to an expression involving the cosine of the angle opposite the side 
and the other two sides. All that is necessary is to remember the form of 
the expression. 

Let us now give a few examples showing how the law of cosines can be 
used to find the missing parts of triangles. 

First, suppose we are given that $a = 4$, $b = 4$, and $c = 1$. We wish
to find the angles of the triangle. Actually, it suffices to find the cosines
of the angles, and so we use the law of cosines. To find $\cos\beta$, for example,
we use

$$b^2 = a^2 + c^2 - 2ac\cos\beta,$$

or

$$\cos\beta = \frac{a^2 + c^2 - b^2}{2ac} = \frac{16 + 1 - 16}{8} = \frac{1}{8}.$$

The cosines of the other angles could be found in a similar way.

For our next example, suppose that $a = 3$, $b = 5$, and $\cos\gamma = \tfrac{1}{2}$.
Then we can solve for $c$ by direct use of the law of cosines in the following
way:

$$c^2 = a^2 + b^2 - 2ab\cos\gamma = 9 + 25 - 15 = 19.$$

Therefore, we find that $c = \sqrt{19}$. The other angles can be found as above.

As a last example, suppose we are given that $a = 2\sqrt{13}$, $c = 6$, and
$\cos\alpha = \tfrac{1}{2}$. We attempt to find $b$ from the law of cosines, using

$$a^2 = b^2 + c^2 - 2bc\cos\alpha.$$

Putting in these values, we find

$$b^2 - 6b - 16 = 0, \qquad \text{or} \qquad (b - 8)(b + 2) = 0.$$

Of the two roots, only the positive one has a meaning in this case, so that
we conclude that $b = 8$. The student should make a sketch to see why
there is only the one solution in this particular case.
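
The three uses of the law of cosines illustrated above can be sketched as
three small functions. The names `cos_of_angle`, `third_side`, and
`sides_from_two_sides_and_opposite_angle` are hypothetical, and the code is
meant only to make the arithmetic of relation (2-33) concrete.

```python
import math

def cos_of_angle(opposite: float, s1: float, s2: float) -> float:
    """Cosine of the angle opposite `opposite`, given the other two sides."""
    return (s1 * s1 + s2 * s2 - opposite * opposite) / (2.0 * s1 * s2)

def third_side(s1: float, s2: float, cos_included: float) -> float:
    """Side opposite the angle included between sides s1 and s2."""
    return math.sqrt(s1 * s1 + s2 * s2 - 2.0 * s1 * s2 * cos_included)

def sides_from_two_sides_and_opposite_angle(a: float, c: float, cos_alpha: float):
    """Positive roots b of b^2 - 2*c*cos(alpha)*b + (c^2 - a^2) = 0."""
    p = c * cos_alpha
    disc = p * p - (c * c - a * a)
    if disc < 0:
        return []                        # no triangle satisfies the conditions
    r = math.sqrt(disc)
    # Keep only roots that can be the length of a side.
    return sorted({b for b in (p - r, p + r) if b > 0})
```

Applied to a triangle with sides 4, 4, and 1, `cos_of_angle(4, 4, 1)` gives
$1/8$ for the angle opposite a side of length 4. The quadratic solved by the
last function has two roots in general, which is the source of the two
possible solutions mentioned earlier.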

Observe that in this section we are interested in the theoretical rather
than the practical use of these formulas. For this reason we do not include,
or consider the use of, tables of trigonometric functions. We consider an
angle of a triangle as being known when we know its cosine, since there is a
unique correspondence between the cosine and the angle when the angle
is restricted to the range $0 \le \alpha \le \pi$.

This is not meant to imply that the student should not know how to 
use the trigonometric tables. He should. In working actual problems, it is 
usually easier to work with the tables than to use the methods we discuss 
below. We are interested, however, in the existence of analytic methods 
which do not require the use of tables. 

There is no difficulty, in theory, in finding the sine of an angle if its
cosine is known, since $\sin^2\alpha = 1 - \cos^2\alpha$, and the sine of any angle
between 0 and $\pi$ is positive. There is, of course, some ambiguity in going
the other way. The real difficulty we find in trying to use the cosines of
the angles is in applying the fact that the sum of the angles of a triangle
is $\pi$. If we know $\cos\alpha$ and $\cos\beta$ and we wish to find $\cos\gamma$, then using the
results of the last section, we find

$$\cos\gamma = \cos[\pi - (\alpha + \beta)] = -\cos(\alpha + \beta);$$

but how do we find $\cos(\alpha + \beta)$ in terms of $\cos\alpha$ and $\cos\beta$?

The answer is in the use of the trigonometric addition formulas. These
are

$$\begin{aligned}
\cos(\alpha + \beta) &= \cos\alpha\cos\beta - \sin\alpha\sin\beta, \\
\sin(\alpha + \beta) &= \sin\alpha\cos\beta + \cos\alpha\sin\beta.
\end{aligned} \tag{2-34}$$

Let us now prove the first of these.

In Fig. 2-21(a), we show the unit circle cut by rays so that the value
of the angle from OP to OA is $\alpha$ and the value of the angle from OA to
OB is $\beta$. Then the value of the angle from OP to OB is $\alpha + \beta$. We
assume that the point P has coordinates (1, 0). Then the point B has
coordinates $(\cos(\alpha + \beta), \sin(\alpha + \beta))$, and therefore,

$$\begin{aligned}
|PB|^2 &= [\cos(\alpha + \beta) - 1]^2 + \sin^2(\alpha + \beta) \\
&= \cos^2(\alpha + \beta) + \sin^2(\alpha + \beta) + 1 - 2\cos(\alpha + \beta) \\
&= 2 - 2\cos(\alpha + \beta).
\end{aligned}$$


Figure 2-21


Consider the same figure, but rotated so that the point A is moved to
A′ on the x-axis. In the resulting figure, Fig. 2-21(b), the point P′ has
coordinates $(\cos(-\alpha), \sin(-\alpha)) = (\cos\alpha, -\sin\alpha)$, and the point B′ has
coordinates $(\cos\beta, \sin\beta)$. Therefore, we can compute

$$\begin{aligned}
|P'B'|^2 &= (\cos\beta - \cos\alpha)^2 + (\sin\beta + \sin\alpha)^2 \\
&= \cos^2\beta + \sin^2\beta + \cos^2\alpha + \sin^2\alpha - 2\cos\alpha\cos\beta + 2\sin\alpha\sin\beta \\
&= 2 - 2[\cos\alpha\cos\beta - \sin\alpha\sin\beta].
\end{aligned}$$

However, $|PB|^2 = |P'B'|^2$, and hence we can conclude that

$$\cos(\alpha + \beta) = \cos\alpha\cos\beta - \sin\alpha\sin\beta,$$

which is what we wished to prove.

It is useful to obtain a similar formula for the cosine of the difference
of two angles. This is easily done as follows:

$$\begin{aligned}
\cos(\alpha - \beta) &= \cos[\alpha + (-\beta)] \\
&= \cos\alpha\cos(-\beta) - \sin\alpha\sin(-\beta) \\
&= \cos\alpha\cos\beta + \sin\alpha\sin\beta.
\end{aligned}$$

The second formula in (2-34) can now be obtained with the help of
formula (2-22),

$$\begin{aligned}
\sin(\alpha + \beta) &= \cos\left[\frac{\pi}{2} - (\alpha + \beta)\right] = \cos\left[\left(\frac{\pi}{2} - \alpha\right) - \beta\right] \\
&= \cos\left(\frac{\pi}{2} - \alpha\right)\cos\beta + \sin\left(\frac{\pi}{2} - \alpha\right)\sin\beta \\
&= \sin\alpha\cos\beta + \cos\alpha\sin\beta,
\end{aligned}$$

which is the second equation of (2-34).

Thus we see that if the values of $\cos\alpha$ and $\cos\beta$ are known for two
angles in a triangle, then the cosine of the third angle is given by

$$\cos\gamma = -\cos(\alpha + \beta) = -\cos\alpha\cos\beta + \sin\alpha\sin\beta.$$

In using this result, it is necessary to recall that the sine of an angle in a
triangle is always taken to be positive.
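
The last formula uses only the two given cosines, with no tables and no
inverse functions. A minimal sketch follows; the name `third_angle_cosine`
is our own, and both inputs are assumed to be cosines of angles of a
triangle, so that each sine may be taken as the positive square root.

```python
import math

def third_angle_cosine(cos_alpha: float, cos_beta: float) -> float:
    """cos(gamma) for a triangle, from cos(alpha) and cos(beta).

    Uses cos(gamma) = -cos(alpha)cos(beta) + sin(alpha)sin(beta), with the
    sines taken positive because the angles lie between 0 and pi.
    """
    sin_alpha = math.sqrt(1.0 - cos_alpha * cos_alpha)
    sin_beta = math.sqrt(1.0 - cos_beta * cos_beta)
    return -cos_alpha * cos_beta + sin_alpha * sin_beta
```

For an equilateral triangle, `third_angle_cosine(0.5, 0.5)` returns a value
close to 0.5, as it should. The function does not check that the two given
angles leave room for a third; that remains the reader's responsibility.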
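As an example of the use of these equations, let us find the missing sides
and angles of the triangle in which $a = 1$, $\cos\beta = \tfrac{11}{16}$, and $\cos\gamma = -\tfrac{1}{4}$.

We see that we can use the law of sines to determine the missing sides
once we have determined the third angle. To this end we compute

\[
\begin{aligned}
\sin\beta  &= \left(1 - \tfrac{121}{256}\right)^{1/2} = \left(\tfrac{135}{256}\right)^{1/2} = \tfrac{3}{16}\sqrt{15}, \\
\sin\gamma &= \left[1 - \tfrac{1}{16}\right]^{1/2} = \tfrac{1}{4}\sqrt{15}, \\
\cos\alpha &= -\cos\beta\cos\gamma + \sin\beta\sin\gamma \\
           &= \tfrac{11}{64} + \tfrac{45}{64} \\
           &= \tfrac{7}{8}.
\end{aligned}
\]

We could compute $\sin\alpha$ from this, or from

\[
\begin{aligned}
\sin\alpha &= \sin[\pi - (\beta+\gamma)] \\
           &= \sin(\beta+\gamma) \\
           &= \sin\beta\cos\gamma + \cos\beta\sin\gamma \\
           &= -\tfrac{3}{64}\sqrt{15} + \tfrac{11}{64}\sqrt{15} \\
           &= \tfrac{1}{8}\sqrt{15}.
\end{aligned}
\]

With these values, we can then use the law of sines to compute

\[
b = \frac{a\sin\beta}{\sin\alpha} = \frac{3}{2}
\]

and

\[
c = \frac{a\sin\gamma}{\sin\alpha} = 2.
\]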
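The procedure used in this example (two cosines and one side given, then the addition formulas, then the law of sines) can be sketched in Python. The function name `solve_triangle` and the sample values are illustrative assumptions: the 3-4-5 right triangle is used only because its answer is easy to verify by hand.

```python
import math

def solve_triangle(a, cos_beta, cos_gamma):
    """Given side a and the cosines of the other two angles,
    return (b, c, cos_alpha). Illustrative helper, not from the text."""
    # The sine of an angle of a triangle is taken to be positive.
    sin_beta = math.sqrt(1.0 - cos_beta ** 2)
    sin_gamma = math.sqrt(1.0 - cos_gamma ** 2)
    # cos(alpha) = -cos(beta + gamma), sin(alpha) = sin(beta + gamma)
    cos_alpha = -cos_beta * cos_gamma + sin_beta * sin_gamma
    sin_alpha = sin_beta * cos_gamma + cos_beta * sin_gamma
    if sin_alpha <= 0.0:
        raise ValueError("beta + gamma is not less than pi; no such triangle")
    # Law of sines: b / sin(beta) = c / sin(gamma) = a / sin(alpha)
    b = a * sin_beta / sin_alpha
    c = a * sin_gamma / sin_alpha
    return b, c, cos_alpha

# Hypothetical check: a = 3, cos(beta) = 3/5, gamma a right angle.
b, c, cos_alpha = solve_triangle(3.0, 3.0 / 5.0, 0.0)
print(b, c, cos_alpha)  # approximately 4, 5 and 0.8
```

The guard on `sin_alpha` reflects the requirement that the two given angles sum to less than $\pi$; without it the law of sines would return a negative length.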
PROBLEMS 


1. Let a right triangle have sides a, b, and c opposite the angles $\alpha$, $\beta$, and $\gamma$
respectively, $\gamma$ being the right angle. For each of the following write an
expression for the required side in terms of the given quantities.

(a) Given $a = 7$, $\alpha = \pi/4$. Find c.    (b) Given $a = 3$, $\beta = 3\pi/5$. Find b.
(c) Given $a = 5$ and $\alpha$. Find b.         (d) Given $c = 7$ and $\alpha$. Find a.
(e) Given c and $\beta$. Find a.                (f) Given b and $\alpha$. Find c.
(g) Given a and b. Find c.
2. In an equilateral triangle, all three sides are of the same length and all three
angles are $\pi/3$.

(a) Use (2-33) to find $\cos \pi/3$.
(b) Find $\sin \pi/3$.
(c) Use (2-22) to find $\sin \pi/6$ and $\cos \pi/6$.


In each of the following problems, a, b, and c are the sides of a triangle opposite
the angles $\alpha$, $\beta$, and $\gamma$ respectively. Using the values given, find the lengths of
the missing sides and the cosines of the missing angles. If there is more than one
triangle satisfying the given conditions, give the values for each of the possible
triangles. If there are no triangles satisfying the given conditions, explain why.

3. (a) a = 2, b = 4, c = 5      (b) a = 2, b = 11, c = 10
   (c) a = 3, b = 4, c = 6      (d) a = 3, b = 8, c = 4

4.
(a) a = 5, b= 6, cosY = 43 (b)b = 7, c =6, cosa = x5 
(c) a = 2, c = 4, cos = — 5% (d) a = 1, c = 2, cosB = 4 
. (a) a = 8, b=5, cosa=— (b)a=7, c = 6, cosa = —3 
(c) a = 2, b= 3, cosB = 3 (d)b = 5, c = 4, cosy = 2 
(e) a = 3, b= 6, cosa = 7 

: (a) c = 5, cosa = Ha, cos B = ae 

(b) b = 6, cosa = #3, cosY = x5 

(c) a = 4, cosB = —2, cosY = 7 

(d) b = 10, cosB = 4}, cosy = 3 


7. Construct another proof of the law of sines along the following lines: Let
A, B, and C be the vertices of a triangle with angles $\alpha$, $\beta$, and $\gamma$ at these
vertices respectively. Let O be the center of the circle circumscribed about
the triangle. For some vertex, say A, choose another vertex, say B, and draw
the line from B through the center O of the circle to intersect the circle again
at A'. Draw A'C. Show that the angle BA'C is the same as $\alpha$. Show that
the triangle BA'C is a right triangle. Compute |BC| in terms of $\alpha$ and the
diameter of the circle. Repeat the same process for the other vertices and use
the results to obtain the law of sines.


3


Vectors 


3-1 CARTESIAN COORDINATES IN THREE-DIMENSIONAL SPACE 


Just as the geometry of the plane is characterized by two real coordinates, 
so the geometry of three-dimensional space can be characterized by three 
real coordinates. In order to do this, we need to have a fixed set of co- 
ordinate axes in space. These may conveniently be chosen to be a set 
of three mutually perpendicular lines intersecting in a single point. 

Suppose that we have such a set of lines. We will assume that each is 
directed and that we have a fixed unit of distance in our three-dimensional 
space which can be applied to each of the lines to make it a coordinate line 
or axis. On each of the lines, a point is determined by a single real number, 
or coordinate. Through that point a unique plane can be constructed 
perpendicular to that coordinate axis. And since the three coordinate lines 
are mutually perpendicular, three points, one on each axis, determine three 
mutually perpendicular planes. Any two of these planes intersect in a 
line, and the third cuts this line at a point. In this manner three real 
numbers determine three distinct points on the axes, which in turn de- 
termine a unique point in the three-dimensional space (see Fig. 3-1). 





FIGURE 3-1 


Conversely, given any point in space, it determines three planes, re- 
spectively perpendicular to the three axes, and these three planes cut the 
axes in unique points which then correspond to three coordinates. There- 
fore, there is a one-to-one correspondence between the points of space 
and sets of three real numbers. 

Note that we are making use of the properties of euclidean three-dimensional
space. If we start from the axioms of euclidean geometry, it
can be shown that three mutually perpendicular directed lines do exist 
and that the above discussion is valid. This is not done here since our 
point of view is going to be the opposite. We will treat the mathematical 
system we develop as an independent entity, which can be thought of as 
a model of euclidean geometry. Throughout the discussion, however, we 
will base this development on the comparison with euclidean space as 
we usually understand it. The discussion should help the student to see 
the relationship between mathematics and the physical world. While 
the student should concentrate on building a geometric picture of the 
mathematical system, he should also realize that the validity of this 
picture is not being proved. Indeed, it is more accurate to say that we 
are postulating the fact that the geometry of space is determined by the 
mathematical system which we shall develop. 

There is one point, however, on which agreement must be reached 
before a physical picture of the mathematical system can be said to be 
known. Suppose we have three mutually perpendicular directed lines 
passing through a single common point, and we wish to label these as the 
axes of our system, say the x-, y-, and z-axes. There are six possible ways
of assigning these labels (why?), but these six fall into only two groups 
which need to be distinguished. 

Note the three ways of labeling the axes shown in (a), (b), and (c) of 
Fig. 3-2. Each can be changed into any other by the rigid euclidean 
motion of rotation; however, the arrangement shown in (d) cannot be so
transformed without using reflection. For, if we leave the x-axis fixed
and attempt to transform Fig. 3-2(d) into Fig. 3-2(b) by rotating about
the x-axis until the y-axes coincide, then the z-axes of the two figures
will be pointing in opposite directions.

Extend the index finger of your right hand, hold the thumb perpendicular
to the index finger but in the same plane as the rest of the hand,
and turn the middle finger inward so that it is perpendicular to the palm
and to both the index finger and thumb. In this position, the hand can be
rotated so as to bring the thumb into the position of the positive x-axis, the
index finger into the position of the positive y-axis, and the middle finger
into the position of the positive z-axis for any of the three arrangements
shown in Fig. 3-2(a), (b), or (c). However, the arrangement of Fig. 3-2(d)
cannot be obtained. This arrangement of axes would require the left hand.

Henceforth, we will agree on the arrangement of axes shown in Figs. 3-2(a),
(b), and (c). This is called a right-handed coordinate system, since the
thumb and the first two fingers of the right hand can be put into the
positions of the x-, y-, and z-axes respectively as described above. Although
this convention is immaterial to the mathematical development,
it would be essential in any consideration of applications to know exactly
which system was being used.

Let us assume now that we have a fixed right-handed coordinate system.
As described above, three coordinates serve to identify uniquely one point
in space. We shall use the convention of writing down the coordinates in
the order of the x-, y-, z-coordinates respectively and enclosing the resulting
triple of numbers in parentheses. Thus a triple of numbers such
as $(1, 2, -1)$ represents the coordinates of some point, but we will actually
go further and identify this triple with the point. That is, when it is
understood that a coordinate system has been fixed, we may speak of
a point $(x_1, y_1, z_1)$ rather than having to say the point with coordinates
$(x_1, y_1, z_1)$. Moreover, we will use a single letter to label the point. This
letter then would represent both the point and the triple of numbers
giving the coordinates of the point.


Definition 3-1. A point of three-dimensional space is an ordered triple
of numbers. Points will be denoted by capital italic letters, e.g.,
$P = (x_1, y_1, z_1)$. The particular point $(0, 0, 0)$ is called the origin.


This definition is the start of the construction of a mathematical model 
of three-dimensional space. We are not saying that the points of the 
space in which we live are really triples of numbers, but that the set of 
all triples of numbers can be used to represent this space. The mathematical 
model will give us something that we can work with algebraically. It is
actually three-dimensional cartesian space with a fixed intrinsic coordinate
system.

FIGURE 3-3

Suppose we have two points given, say $P_1 = (x_1, y_1, z_1)$ and
$P_2 = (x_2, y_2, z_2)$. We would like to give a formula for the distance between
these two points which would coincide with the euclidean concept of
distance. Observe the set of points shown in Fig. 3-3. It is clear from the
figure how these points are obtained, but they can be completely defined
in terms of their coordinates. Thus $A = (x_1, y_1, 0)$, $B = (x_2, y_2, 0)$,
$C = (x_1, y_2, 0)$, $D = (x_2, y_2, z_1)$, and $E = (x_1, y_2, z_1)$. (Verify all of
these.) The points $P_1$ and $E$ lie on a line parallel to the y-axis since only
their y-coordinates differ. The distance between these two points is
therefore the absolute value of the difference of their y-coordinates, or
$|y_1 - y_2|$. Likewise the distance between the points $D$ and $E$ is $|x_1 - x_2|$.
The triangle $P_1ED$ is a right triangle, and hence from the Pythagorean
theorem, the distance between $P_1$ and $D$ would be

\[
[(x_1 - x_2)^2 + (y_1 - y_2)^2]^{1/2}.
\]

Similarly, the triangle $P_1P_2D$ is a right triangle, the distance between $P_2$
and $D$ is $|z_1 - z_2|$, and the Pythagorean theorem finally gives the distance
between the points $P_1$ and $P_2$. Thus we have:


Definition 3-2. Given two points $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$,
the distance between them is defined to be:

\[
|P_1P_2| = [(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2]^{1/2}.
\]

Thus, for example, if $A = (7, 3, -9)$ and $B = (11, -5, -8)$, then

\[
\begin{aligned}
|AB| &= [(7 - 11)^2 + (3 + 5)^2 + (-9 + 8)^2]^{1/2} \\
     &= [(-4)^2 + 8^2 + (-1)^2]^{1/2} \\
     &= [16 + 64 + 1]^{1/2} \\
     &= 9.
\end{aligned}
\]
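As a small illustration, the distance formula of Definition 3-2 translates directly into code. The Python sketch below uses a hypothetical helper named `distance` and represents points as plain tuples; it reproduces the example with A and B.

```python
import math

def distance(p, q):
    """Distance between two points given as (x, y, z) tuples (Definition 3-2)."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

A = (7, 3, -9)
B = (11, -5, -8)
print(distance(A, B))  # 9.0
```

Because the differences are squared, the order of the two points does not matter: `distance(A, B)` and `distance(B, A)` agree.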
From this distance formula we may obtain the algebraic conditions
satisfied by the points on a sphere. Since by definition a sphere is the set
of all points which are at some fixed distance from a fixed point, we have:


Definition 3-3. A sphere with center $(x_0, y_0, z_0)$ and radius $R > 0$ is

\[
\{(x, y, z) \mid (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = R^2\}.
\]

For short, we say that

\[
(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = R^2
\]

is the equation of the sphere.


From this formula, it is easy to write down the equation of any sphere 
if we are given its center and the radius. Conversely, if we are given an 
equation which can be brought into this form, we can recognize it as the 
equation of a sphere and obtain the center and radius. 

The equation of the sphere of radius 4 whose center is located at the
point $(1, 3, -2)$ is thus

\[
(x - 1)^2 + (y - 3)^2 + (z + 2)^2 = 16,
\]

or

\[
x^2 + y^2 + z^2 - 2x - 6y + 4z - 2 = 0.
\]

On the other hand, if we are given the equation

\[
x^2 + y^2 + z^2 + 3x - 6z + 11 = 0,
\]

we can identify this as the equation of a sphere by completing the square:

\[
\begin{aligned}
x^2 + 3x + y^2 + z^2 - 6z &= -11, \\
x^2 + 3x + \tfrac{9}{4} + y^2 + z^2 - 6z + 9 &= -11 + \tfrac{9}{4} + 9, \\
(x + \tfrac{3}{2})^2 + y^2 + (z - 3)^2 &= \tfrac{1}{4}.
\end{aligned}
\]

Therefore, we conclude that this is the equation of a sphere of radius $\tfrac{1}{2}$
whose center is located at the point $(-\tfrac{3}{2}, 0, 3)$.
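Completing the square can be carried out once and for all for an equation of the form x^2 + y^2 + z^2 + Bx + Cy + Dz + E = 0: the center is (-B/2, -C/2, -D/2) and the squared radius is (B^2 + C^2 + D^2)/4 - E. The Python sketch below packages this; the function name `sphere_from_general` and the choice to return `None` when the squared radius is not positive are assumptions made for illustration.

```python
import math

def sphere_from_general(B, C, D, E):
    """Center and radius of x^2 + y^2 + z^2 + Bx + Cy + Dz + E = 0,
    or None if the equation does not describe a sphere of positive radius."""
    center = (-B / 2, -C / 2, -D / 2)
    # Completing the square adds (B/2)^2 + (C/2)^2 + (D/2)^2 to both sides.
    r_squared = (B * B + C * C + D * D) / 4 - E
    if r_squared <= 0:
        return None
    return center, math.sqrt(r_squared)

# The second example above: x^2 + y^2 + z^2 + 3x - 6z + 11 = 0
print(sphere_from_general(3, 0, -6, 11))   # center (-1.5, 0, 3), radius 0.5
# The first example above, in expanded form
print(sphere_from_general(-2, -6, 4, -2))  # center (1, 3, -2), radius 4
```

An equation whose leading coefficients are a common nonzero number A can be handled by dividing through by A first.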

The concept of translation will prove to be most important in our de- 
velopment. Let us investigate how this concept fits into the algebraic 
model we are developing. A translation (or parallel translation) of the 
points of space is a rigid motion of the points—that is, a motion which 
leaves the distance between any two points unchanged—and is such that 
every point is moved exactly the same distance and in the same direction. 




Suppose that under such a translation the point which is originally at
the origin is moved to the point $(h, j, k)$, and that a point $(x, y, z)$ is moved
to the point $(x', y', z')$. What is the relationship between these sets of
coordinates? See Fig. 3-4. Through the point $(h, j, k)$ we have drawn
dashed lines parallel to the original axes. The point $(x', y', z')$ will be a
signed distance $x$ from the plane through the point $(h, j, k)$ perpendicular
to the x-axis. But this plane cuts the x-axis at the point with coordinate $h$.
Therefore, $x'$, the first coordinate of $(x', y', z')$, must be $x + h$. Similarly,
we see that $y' = y + j$ and $z' = z + k$. This then is the background
for the following definition:







Definition 3-4. A translation of the points of three-dimensional space is
a function whose domain and range are the space, and for which there
are three real numbers h, j, and k such that if the translate of a point
$(x, y, z)$ is the point $(x', y', z')$, then

\[
\begin{aligned}
x' &= x + h, \\
y' &= y + j, \\
z' &= z + k.
\end{aligned}
\tag{3-1}
\]

FIGURE 3-4


A translation is therefore a mapping which carries the points in one 
copy of three-dimensional space into the points of another copy of the 
same space. We might prefer to think of there being only a single three- 
dimensional space, and the translation as being a motion of the points of 
this space. Every point is moved from its original position to its new trans- 
lated position. 

An entirely different point of view can also be taken. Our definition
of the points of space as being triples of numbers is merely a useful mathematical
device. The more “physical” point of view would be that three-dimensional
space has an intrinsic existence, and that the coordinates of a
point result from the (arbitrary) introduction of a coordinate system.
In this picture, a translation results from the introduction of a new coordinate
system in a “translated position.” The translation given by
Eqs. (3-1) would be obtained by introducing the new x'-, y'-, z'-coordinate
system with its axes parallel to the original axes, but all intersecting at
the point $(-h, -j, -k)$, where the coordinates of this point are given
in terms of the original coordinate system. Whatever geometric point of
view is taken, the algebraic form of a translation is the same. It is given
by Eqs. (3-1).
Let us note that translation as defined above actually does preserve the
distance between points. If we have two points, $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$,
which are translated to $(x_1', y_1', z_1')$ and $(x_2', y_2', z_2')$, then

\[
\begin{aligned}
x_1' &= x_1 + h, & y_1' &= y_1 + j, & z_1' &= z_1 + k, \\
x_2' &= x_2 + h, & y_2' &= y_2 + j, & z_2' &= z_2 + k,
\end{aligned}
\]

but then

\[
x_1' - x_2' = x_1 - x_2, \qquad y_1' - y_2' = y_1 - y_2, \qquad z_1' - z_2' = z_1 - z_2,
\]

and inserting these values into the distance formula of Definition 3-2,
we find that the distance between the translated points is the same as
the distance between the original points.
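The invariance of distance under translation can also be seen in a short computation. In the Python sketch below, `translate` applies Eqs. (3-1) and `distance` is the formula of Definition 3-2; both names, the two sample points, and the shift (h, j, k) = (4, -2, 10) are illustrative choices.

```python
import math

def translate(p, h, j, k):
    """Apply Eqs. (3-1): (x, y, z) -> (x + h, y + j, z + k)."""
    x, y, z = p
    return (x + h, y + j, z + k)

def distance(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

P1, P2 = (1, 1, 2), (3, 7, 5)
shift = (4, -2, 10)  # hypothetical values of h, j, k

Q1, Q2 = translate(P1, *shift), translate(P2, *shift)
# The coordinate differences are unchanged, so the distance is unchanged.
print(distance(P1, P2), distance(Q1, Q2))  # 7.0 7.0
```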

It is important to note the distinction between a point as a physical 
location and a point as an algebraic entity. In our intuitive discussions, 
we think of a point as a physical location, that is, as a “geometric point.” 
In our formal mathematical development, we will mean the algebraic
entity of Definition 3-1. Our development assumes a complete correspondence
between these two points of view.


PROBLEMS 


1. Find the distance between the following pairs of points:

(a) $(1, 1, 2)$ and $(3, 7, 5)$
(b) $(1, 0, 7)$ and $(6, 2, -1)$
(c) $(1, 4, -5)$ and $(5, -3, -8)$

2. Find the coordinates of a point whose distance from the origin is $\sqrt{6}$, and
whose distances from the points $(0, 0, 1)$ and $(2, 2, 2)$ are $\sqrt{3}$ and $\sqrt{2}$
respectively. Is there more than one such point? If so, how many such points
are there, and what are their coordinates?

3. Write the equations of the spheres:


(a) with radius 5 and center $(3, 1, -2)$
(b) with radius 2 and center $(1, 0, 1)$
(c) with radius 10 and center $(8, -6, 0)$


4. Show that the equation of any sphere can be brought into the form

\[
x^2 + y^2 + z^2 + Bx + Cy + Dz + E = 0,
\]

and that any equation of the form

\[
Ax^2 + Ay^2 + Az^2 + Bx + Cy + Dz + E = 0
\]

with $A \neq 0$ can be brought into the form of the equation of a sphere. State
the conditions under which this would be the equation of a sphere. If it is a
sphere, what is its center and radius?


5. Find the center and radius of each of the spheres with the following equations:
(a) $x^2 + y^2 + z^2 - 6x + 2y + 4z + 5 = 0$
(b) $2x^2 + 2y^2 + 2z^2 + 4x - 20z + 32 = 0$
(c) $3x^2 + 3y^2 + 3z^2 - 5x + 6y - 12z - 8 = 0$


6. Find the equation, center, and radius of the sphere which passes through
the four points $(3, 0, 4)$, $(-1, 3, -1)$, $(-2, 0, -1)$, and $(8, -4, 2)$.


7. Find the equation of the sphere with center $(1, 5, -2)$ which passes through
the point $(2, 0, 4)$.

8. What is the result of two successive translations? How could this problem 
be handled algebraically? 


9. If B is a set of points which satisfy an equation $f(x, y, z) = 0$, what equation
is satisfied by the set of points obtained from B by a translation? Apply
this to the equation of a sphere. What happens to a sphere when it is
translated?
3-2 DIRECTION COSINES AND DIRECTION NUMBERS


By a ray we mean a half-line; that is, the set of all points which are on
one side of a given point on a line. A sense of direction is automatically 
assigned to the ray, the positive direction being away from the given 
point. This description is not adequate in terms of the algebraic frame- 
work we are erecting, for as yet we do not know (algebraically) what a 
straight line is. Again, we proceed by letting our geometric intuition be 
our guide. Let us consider the points on a ray originating from the origin. 

Let a point $P = (l, m, n)$, not the origin, be assumed to be on the ray
we wish to discuss and let $X = (x, y, z)$ be any other point on the ray.
From these two points drop lines perpendicular to the x-axis. The one
from P will meet the x-axis at the point with coordinate $l$, the other at
the point with coordinate $x$. Suppose now that $l \neq 0$. (See Fig. 3-5.)
The two right triangles formed are similar and hence the ratio of $|x|$ to $|l|$
is the same as the ratio of the distance $|OX|$ to $|OP|$. Let $|OX|/|OP| = t$.
Then $|x|/|l| = t$. However, these two points on the x-axis are both on
the same side of the origin and we can conclude that $x/l = |x|/|l|$, and
so $x = lt$.

In case $l$ happened to be zero, the ray must have been perpendicular
to the x-axis, in which case $x = 0$ and the equation $x = lt$ would still hold.
FIGURE 3-5          FIGURE 3-6


In the same way, we could drop perpendiculars to the y- and z-axes 
and conclude that y = mt and z = nt, the ratio t being the same in each 
case. The above analysis then is the motivation for the definition: 


Definition 3-5. A ray from the origin through the point $(l, m, n)$, not
the origin, is

\[
\{(x, y, z) \mid x = lt,\ y = mt,\ z = nt,\ \text{for all } t \geq 0\}.
\]

The coordinates of any point of the ray, other than the origin, are called
a set of direction numbers for the ray.


The reader can try to exercise his critical sense on this definition. Al- 
though complete as it stands, it raises an important question. After trying 
to discover this question, check with Problem 3 at the end of this section. 

Note that the number $t$ which appears in this definition is the ratio
$|OX|/|OP|$. This fact can be used, for example, to find the point two-thirds
of the way from the origin to the point $(6, -9, 5)$. The desired point
would be $(4, -6, \tfrac{10}{3})$. Why?

Suppose now we have a ray with direction numbers $l, m, n$. Let this
ray cut the unit sphere (the sphere of radius 1, centered at the origin,
with equation $x^2 + y^2 + z^2 = 1$) at the point $(\lambda, \mu, \nu)$. We might remark
here that the postulates of euclidean geometry are sufficient to show
that this point exists,* but the proof is quite difficult. In our case, this
is easy to show. We merely need to set $t = 1/[l^2 + m^2 + n^2]^{1/2}$ in the
definition above. It is simple to verify that the point $(\lambda, \mu, \nu)$ with

\[
\lambda = \frac{l}{[l^2 + m^2 + n^2]^{1/2}}, \qquad
\mu = \frac{m}{[l^2 + m^2 + n^2]^{1/2}}, \qquad
\nu = \frac{n}{[l^2 + m^2 + n^2]^{1/2}}
\tag{3-2}
\]

is common to the ray and the unit sphere.

* Euclid's postulates are not sufficient to show this, but later extensions,
such as the postulate system of Hilbert, are sufficient for a rigorous proof of
this fact.

It is clear geometrically that there is only one such point on each ray. 
Conversely, we see that there is a unique ray from the origin through each 
point of the unit sphere. Thus, the three coordinates of this point on the 
unit sphere suffice to determine the ray completely. 

A given ray determines three angles, one between it and each of the
positive directions along the axes. Let $\alpha$, $\beta$, and $\gamma$ be the values of these
angles, as shown in Fig. 3-6. Consider $\alpha$, the value of the angle between
the positive direction of the x-axis and the ray. The perpendicular from
the point $(\lambda, \mu, \nu)$ to the x-axis meets the x-axis at the point with
coordinate $\lambda$. But at the same time, since the length of the segment from the
origin to $(\lambda, \mu, \nu)$ is one, this point on the x-axis also has coordinate $\cos\alpha$.
That is, $\lambda = \cos\alpha$. (See Fig. 3-7.) In exactly the same way, if $\beta$ and $\gamma$
are the values of the angles between the ray and the y-axis and z-axis
respectively, then it is seen that $\mu = \cos\beta$ and $\nu = \cos\gamma$. This leads to
the following definition:


Definition 3-6. If $(l, m, n)$ is a set of direction numbers for a ray, then
the set of numbers

\[
\lambda = \frac{l}{[l^2 + m^2 + n^2]^{1/2}}, \qquad
\mu = \frac{m}{[l^2 + m^2 + n^2]^{1/2}}, \qquad
\nu = \frac{n}{[l^2 + m^2 + n^2]^{1/2}}
\]

is called the set of direction cosines of the ray.

FIGURE 3-7


For example, the ray from the origin through the point $(8, -1, -4)$ has
direction numbers $(8, -1, -4)$, or $(16, -2, -8)$, or $(4, -\tfrac{1}{2}, -2)$, or any
other set of positive multiples. But this ray has only the single set of
direction cosines, $(\tfrac{8}{9}, -\tfrac{1}{9}, -\tfrac{4}{9})$.
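Definition 3-6 amounts to dividing a set of direction numbers by the square root of the sum of their squares. A minimal Python sketch follows; the function name `direction_cosines` is an illustrative choice. It shows that different positive multiples of the same direction numbers lead to the same direction cosines.

```python
import math

def direction_cosines(l, m, n):
    """Direction cosines of the ray from the origin through (l, m, n)."""
    norm = math.sqrt(l * l + m * m + n * n)
    if norm == 0:
        raise ValueError("(0, 0, 0) is not a set of direction numbers")
    return (l / norm, m / norm, n / norm)

# Three sets of direction numbers for the same ray
for numbers in [(8, -1, -4), (16, -2, -8), (4, -0.5, -2)]:
    lam, mu, nu = direction_cosines(*numbers)
    # Each result is approximately (8/9, -1/9, -4/9), and the point
    # (lam, mu, nu) lies on the unit sphere.
    print(lam, mu, nu, lam * lam + mu * mu + nu * nu)
```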

Our next step is to introduce the notion of a directed line segment. The
essential idea is to consider a directed line segment as the translate of a
portion of a ray from the origin. For example, suppose $P = (l, m, n)$
is some point other than the origin. Then the set of all points on the ray
from O through P which lie between O and P is

\[
R = \{(x, y, z) \mid x = lt,\ y = mt,\ z = nt,\ 0 \leq t \leq 1\},
\]

and if we make a translation which carries the point at the origin to
$(x_0, y_0, z_0)$, then R is translated to a set R' which can be seen to be

\[
\begin{aligned}
R' &= \{(x', y', z') \mid x' = x + x_0,\ y' = y + y_0,\ z' = z + z_0,\ \text{and } (x, y, z) \in R\} \\
   &= \{(x', y', z') \mid x' = x_0 + lt,\ y' = y_0 + mt,\ z' = z_0 + nt,\ 0 \leq t \leq 1\} \\
   &= \{(x, y, z) \mid x = x_0 + lt,\ y = y_0 + mt,\ z = z_0 + nt,\ 0 \leq t \leq 1\}.
\end{aligned}
\]


The last step here follows from the fact that the symbols used in defining 
a set are “dummy variables.” The set is the same no matter what letters 
are used for variables. 

The ray from the origin through $P = (l, m, n)$ has direction numbers
$(l, m, n)$. We assign these same direction numbers to the directed line
segment obtained in this way by translation of the segment from O to P.
Observe that these direction numbers are then merely the differences of
the values of the coordinates at the two ends of the segment. Indeed, if
$P_1 = (x_1, y_1, z_1)$ is the point corresponding to $t = 1$, then $x_1 = x_0 + l$
and hence $x_1 - x_0 = l$. Similarly, $y_1 - y_0 = m$ and $z_1 - z_0 = n$.


Definition 3-7. A directed line segment $P_0P_1$ whose initial point is at
$P_0 = (x_0, y_0, z_0)$ and whose terminal point is at $P_1 = (x_1, y_1, z_1)$ is
the set of points

\[
\{(x, y, z) \mid x = x_0(1 - t) + x_1 t,\ y = y_0(1 - t) + y_1 t,\ 
z = z_0(1 - t) + z_1 t,\ \text{for all } t \text{ with } 0 \leq t \leq 1\},
\]

together with the sense of direction determined by increasing $t$. This
directed line segment is said to have direction numbers

\[
(x_1 - x_0,\ y_1 - y_0,\ z_1 - z_0)
\]

and length

\[
d = [(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2]^{1/2}.
\]

Its direction cosines are $\lambda = (x_1 - x_0)/d$, $\mu = (y_1 - y_0)/d$, and
$\nu = (z_1 - z_0)/d$.


For short, we will speak of the directed line segment $P_0P_1$ as being
from $P_0$ to $P_1$ rather than always referring to its initial and terminal points.
While the coordinates of any point on a ray from the origin form a set of
direction numbers for that ray, we say that a directed line segment whose
initial point is at the origin and whose terminal point is at $(l, m, n)$ has
direction numbers $(l, m, n)$ and only these. That is, a directed line segment
will have only a single set of direction numbers. Note also that
the length of a directed line segment is the square root of the sum of the 
squares of its direction numbers, which is exactly our formula for the 
distance between the initial and terminal points. 

The direction numbers of a directed line segment determine completely
its direction in space and its length. Also, if we make any parallel
translation, the direction numbers of a given directed line segment remain
unchanged. This is usually expressed by saying that the direction numbers
of a directed line segment are invariant under translation.

As an example, consider the directed line segment from $A = (7, -3, 2)$
to $B = (1, 0, 4)$. Its direction numbers are $(1 - 7, 0 + 3, 4 - 2)$ or
$(-6, 3, 2)$. The length of this line segment is $|AB| = [36 + 9 + 4]^{1/2} = 7$.
Therefore its direction cosines are $(-\tfrac{6}{7}, \tfrac{3}{7}, \tfrac{2}{7})$.
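The quantities attached to a directed line segment by Definition 3-7 can be computed together. The Python sketch below uses a hypothetical helper `segment_data`; it takes A = (7, -3, 2), the reading consistent with the subtraction 0 + 3 in the example, and also checks that the direction numbers do not change when both endpoints are translated by an arbitrarily chosen amount.

```python
import math

def segment_data(p0, p1):
    """Direction numbers, length and direction cosines of the segment p0 -> p1."""
    numbers = tuple(b - a for a, b in zip(p0, p1))      # terminal minus initial
    length = math.sqrt(sum(d * d for d in numbers))
    if length == 0:
        raise ValueError("initial and terminal points must be distinct")
    cosines = tuple(d / length for d in numbers)
    return numbers, length, cosines

A, B = (7, -3, 2), (1, 0, 4)
numbers, length, cosines = segment_data(A, B)
print(numbers, length)  # (-6, 3, 2) 7.0
print(cosines)          # approximately (-6/7, 3/7, 2/7)

# Translating both endpoints by (5, 1, -4) leaves the direction numbers unchanged.
shifted = segment_data((12, -2, -2), (6, 1, 0))
print(shifted[0])       # (-6, 3, 2)
```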

The parameter $t$ in Definition 3-7, just as in Definition 3-5, represents
a ratio of distances. In fact, making use of the definition of distance we
can prove


Theorem 3-1. Let $P_0 = (x_0, y_0, z_0)$ and $P_1 = (x_1, y_1, z_1)$ be two
distinct points. Let $t$ be any real number between zero and one. Then the
point

\[
X = (x_0(1 - t) + x_1 t,\ y_0(1 - t) + y_1 t,\ z_0(1 - t) + z_1 t) \tag{3-3}
\]

on the directed line segment $P_0P_1$ has the property $|P_0X|/|P_0P_1| = t$.


Proof: To prove this theorem, we merely need to compute

\[
\begin{aligned}
|P_0X|^2 &= (x_1 t - x_0 t)^2 + (y_1 t - y_0 t)^2 + (z_1 t - z_0 t)^2 \\
         &= t^2[(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2] \\
         &= t^2 |P_0P_1|^2,
\end{aligned}
\]

which, since $t \geq 0$, is equivalent to the desired result.


Observe that the point X given in this theorem can also be characterized
by the fact that it divides the line segment $P_0P_1$ in the ratio $t/(1 - t)$.
Noting this fact makes it easier for some students to remember (3-3) in
the form

\[
X = (x_0 + t(x_1 - x_0),\ y_0 + t(y_1 - y_0),\ z_0 + t(z_1 - z_0)). \tag{3-4}
\]


It is worthwhile learning this formula for the special case when $t = \tfrac{1}{2}$.
The point in question is then the midpoint of the line segment $P_0P_1$ and
from (3-3) we see that the coordinates of this midpoint are the averages
of the corresponding coordinates of the endpoints of the segment.

For example, the midpoint of the segment AB, where $A = (7, -3, 2)$
and $B = (1, 0, 4)$, is $C = (4, -\tfrac{3}{2}, 3)$. This result can be written down by
inspection, but other division points usually take more computation.
Thus, the point D, one-third of the way from A to B, has coordinates

\[
D = (7 + \tfrac{1}{3}(1 - 7),\ -3 + \tfrac{1}{3}(0 + 3),\ 2 + \tfrac{1}{3}(4 - 2)) = (5, -2, \tfrac{8}{3}),
\]

as calculated from formula (3-4).
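Formula (3-4) is convenient to evaluate mechanically. The Python sketch below uses exact rational arithmetic from the standard `fractions` module so that thirds are not rounded; the helper names `point_on_segment` and `distance` are illustrative. It computes the midpoint and the one-third point of the segment from A = (7, -3, 2) to B = (1, 0, 4), and then checks the ratio asserted by Theorem 3-1.

```python
from fractions import Fraction
import math

def point_on_segment(p0, p1, t):
    """Formula (3-4): the point p0 + t (p1 - p0), coordinate by coordinate."""
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

A, B = (7, -3, 2), (1, 0, 4)
C = point_on_segment(A, B, Fraction(1, 2))  # midpoint: (4, -3/2, 3)
D = point_on_segment(A, B, Fraction(1, 3))  # one-third of the way: (5, -2, 8/3)

# Theorem 3-1: |AD| / |AB| equals the parameter t = 1/3 (up to rounding).
ratio = distance(A, D) / distance(A, B)
print(C, D, ratio)
```

Values of t outside the interval from 0 to 1 still give points of the straight line through A and B, but not of the segment itself.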

The above definition can be extended easily to define a straight line. 
The formal discussion of straight lines will be postponed until Section 4-4, 
but we give the definition here so that we will be able to speak of straight 
lines if necessary. 


Definition 3-8. Let $P_0 = (x_0, y_0, z_0)$ and $P_1 = (x_1, y_1, z_1)$ be two
distinct points. Then the straight line through $P_0$ and $P_1$ is

\[
\{(x, y, z) \mid x = x_0(1 - t) + x_1 t,\ y = y_0(1 - t) + y_1 t,\ 
z = z_0(1 - t) + z_1 t,\ t \text{ any real number}\}.
\]


PROBLEMS 
1. What are the direction cosines of the directed ray from the origin through the
following points?
(a) $(2, 6, 3)$       (b) $(7, 3, -5)$
(c) $(-1, -1, 5)$     (d) $(2, -2, -1)$


2. Find the direction numbers, length, and direction cosines of the directed line
segments:
(a) From $(3, 1, 7)$ to $(-2, 5, 3)$    (b) From $(1, 1, 1)$ to $(7, 2, 5)$
(c) From $(0, 1, 1)$ to $(1, 0, 1)$     (d) From $(1, 1, 1)$ to $(-1, 0, -1)$


3. Let R be the ray from the origin through the point $(l, m, n)$. Let $(l', m', n')$
be any point (other than the origin) on this ray. Prove that the ray, R',
from the origin through the point $(l', m', n')$ is identical to R.


Remark: Definition 3-5 defines a ray as a certain set of points, depending on 
a given point. What you are asked to show is that the resulting set of points 
is the same, no matter what point of the ray we start with. Note that in order 
to show that two sets R and R’ are the same, you must show that any point 
in R is also in R’ and, conversely, that any point in R’ is also in R. 


4. In each of the coordinate axes, the set of all points which have nonnegative 
coordinates constitutes a ray. What are the direction cosines of these three 
rays? 
5. Show that the direction numbers of a directed line segment are invariant
under translation. 


6. Find the midpoints of the directed line segments of Problem 2. Also find the 
points which are 4 and % of the distance from the initial point to the terminal 
points. 


7. Let R be a ray from the origin through $(l, m, n)$. Let $\lambda$, $\mu$, and $\nu$ be defined
as in (3-2).

(a) Show that $(\lambda, \mu, \nu)$ is at distance 1 from the origin.

(b) Show that any other point of the ray is at a distance other than 1 from
the origin. [Hint: Use the result of Problem 3 above to express the points
of the ray in terms of $\lambda$, $\mu$, and $\nu$.]

8. Let $A = (a_1, a_2, a_3)$, $B = (b_1, b_2, b_3)$, and $C = (c_1, c_2, c_3)$ be three distinct
noncollinear points in space. Consider these three points as the vertices of a
triangle and let A', B', and C' be the midpoints of the sides opposite the
vertices A, B, and C, respectively. The line segments AA', BB', and CC' are
therefore the medians of the triangle. Let A'', B'', and C'' be the points
two-thirds of the way from A to A', B to B', and C to C', respectively.

(a) Find the coordinates of A', B', and C'.

(b) Find the coordinates of A'', B'', and C''. What can you conclude?


9. What happens to the conclusion of Theorem 3-1 if the point X in (3-3) is
such that $t$ does not lie between 0 and 1? State and prove a theorem about
the location of X in relation to $P_0$ and $P_1$ for $t > 1$ or $t < 0$.


3-3 VECTORS 


The concept of a vector is a consequence of physical fact. It has long been 
observed that a single number is insufficient to characterize certain physical 
phenomena. For example, a moving object may have a known speed, 
but until we also know its direction we cannot say that we can describe 
its motion. Similarly, force requires both a number (the magnitude of 
the force) and a direction (the direction of application of the force) to 
characterize it. 

Physicists long ago found it useful to introduce a single symbol to repre- 
sent a quantity, such as force, which has both a magnitude and a direction. 
They called such a quantity a vector. From the observed physical behavior 
of such quantities, algebraic operations on vectors were defined. 

For example, if F represented a certain force, say a force of 100 dynes 
directed straight downward, it was found useful to be able to represent a 
force of some different magnitude, say 200 dynes, but still in the same 
direction. It seemed logical to write this second force as 2F, the number 2 
being thought of as doubling the force without changing its direction. Such 
a multiple of a vector was called a scalar multiple of a vector, pure num- 
bers being called scalars to distinguish them from vectors. 
A second algebraic operation, that of vector addition, was defined in
terms of the result of applying two forces simultaneously. If a force is 
applied to an object, that object will move unless a second force of exactly 
equal magnitude but pointed in the opposite direction is also applied at 
the same time. A simple physical experiment serves to show how simul- 
taneous forces must be combined. Suppose two strings are tied together 
at a point P, and a weight is suspended by a third string from the same 
point P (see Fig. 3-8). The forces on the point P are applied through the 
strings, and hence must be in the direction of the strings. The weight exerts 
a known force through the vertical string on the point P. Since the point 
P is not moving, the resultant of the forces exerted by the two supporting 
strings must exactly match the downward force exerted by the weight. 
That is, it must be as shown by the dashed arrow in Fig. 3-8, where the 
length of the arrow represents the magnitude of the force. 


FIGURE 3-8 

If the forces exerted by the two supporting strings are measured, for example 
by introducing spring balances into the strings, it is found that 
the forces line up as shown in the insert of Fig. 3-8. In other words, if we 
draw lines in the direction of the forces, whose lengths are equal (in some 
units) to the magnitudes of the forces, then the resultant of these two 
forces is represented by the line which forms the diagonal of the parallelo- 
gram determined by the given lines. 

Such a diagram, called in physics the parallelogram of forces, determines 
the way in which two forces combine. The same diagram is then used to 
define (physically) the sum of two vectors. 

While the above description of vectors in physical terms is quite satis- 
factory for the use that is made of them in elementary physics, it leaves 
much to be desired mathematically. In particular, it has been found de- 
sirable to generalize the idea of a vector to situations in which it is im- 
possible to give an accurate definition of what is meant by “direction.” 
For this reason we would like to give a mathematical definition of a vector, 
which can then be used without reference to physical intuition. 

To a mathematician, the only really satisfactory way to introduce 
vectors is by the axiomatic method, since only in this manner can he make 
the broad generalizations which have been found so useful. This method 
of introduction corresponds exactly to what is frequently called an “op- 
erational definition”; that is, vectors would be defined only in terms of 
how they behave. A set of postulates would be given and a vector would 
be anything that satisfied these postulates. 

Here, however, we will begin in a more modest manner and merely try 
to define something which corresponds to the physical description of a 
vector. At a later stage we may then return to obtain the abstract defini- 
tion. The above physical description will be held in mind throughout. 
The mathematical development will be motivated by this physical de- 
scription. 

Let us consider then what is required. The physical description asks 
for a quantity with both direction and magnitude. We may take our clue 
from the arrows that physicists use to represent vectors and observe that 
a directed line segment satisfies our requirements. There is still a diffi- 
culty, however. Two directed line segments with the same directions and 
lengths may be distinct, yet they would represent the same vector. This 
could be taken care of by using only directed line segments with initial 
points at the origin, but it is convenient to be able to associate vectors with 
arbitrary directed line segments. All we need to do is find a property of a 
directed line segment which determines its direction and magnitude but 
which is independent of translation. The set of direction numbers of the 
directed line segment satisfies this requirement. 


Definition 3-9. A vector is a triple of numbers. Vectors will be denoted 
by boldface letters. The vector $\mathbf{0} = [0, 0, 0]$ is called the zero vector. 
Given a vector $\mathbf{A} = [a_1, a_2, a_3]$, the three numbers $a_1$, $a_2$, and $a_3$ 
are called the components of $\mathbf{A}$, and the quantity 

$$|\mathbf{A}| = [a_1^2 + a_2^2 + a_3^2]^{1/2}$$

is called the magnitude of $\mathbf{A}$. 


In this text, we follow the common practice of indicating vectors by 
means of boldface type. In handwritten work it is difficult to indicate 
boldface letters; so some other convention is usually used. Most students 
find it easiest to indicate a vector by placing an arrow or bar over the 
letter, but any consistent usage is satisfactory. 

Note that by virtue of the above definition, if two vectors differ in any 
one of their components, they are different. We therefore write A = B 
if and only if the two triples of numbers are identical. 

If we compare the definition of a vector with the definition of a point 
in space, we see that they are identical. How can this be? Surely they 
must be different? As surprising as it might seem, there is in reality no 
essential difference. The set of all vectors, as defined here, constitutes a 
vector space which we can identify in a natural manner with our three- 
dimensional euclidean space. We could make this identification complete 
and consider only one space, but because of the possible applications, 
it is useful to have both the three-dimensional space and the vector space 
as two distinct entities. On the other hand, many of the properties of 
three-dimensional space can be expressed most easily in terms of vectors, 
so it would be convenient to be able to make this identification when 
needed. For these reasons, we will introduce the following conventions. 


Convention 3-1. The same triple of numbers may represent either a 
point or a vector. When it is meant to represent a point, the triple will 
be enclosed in parentheses, 

$$A = (a_1, a_2, a_3).$$

When it is meant to represent a vector, it will be enclosed in brackets, 

$$\mathbf{A} = [a_1, a_2, a_3].$$


The same letter in boldface or italic type will be used to represent the 
same triple of numbers, considered as a vector or a point, respectively. 


These conventions identify the vectors as points in space or, perhaps 
more accurately, as directed line segments with their initial points at the 
origin. This point of view produces what physicists call “bound vectors.” 
Physicists also found it useful to have “free vectors,” which are thought 
of as directed line segments with arbitrary placement. To accommodate 
this idea, we introduce a function which associates a vector to each di- 
rected line segment. Note that in this definition, although each directed 
line segment is associated with a unique vector, many different line seg- 
ments may be associated with a single vector. 


Definition 3-10. Let $P_1$ and $P_2$ be two given points and let $P_1P_2$ be 
the directed line segment from $P_1$ to $P_2$. Then the vector $\overrightarrow{P_1P_2}$ is the 
vector whose components are the direction numbers of the directed line 
segment $P_1P_2$. 


Thus, for example, the directed line segment AB, where A = (3, 5, 2) 
and B = (-1, 6, 5), has associated with it the vector $\overrightarrow{AB} = [-4, 1, 3]$. 

In most of our work with vectors, we will find ourselves treating directed 
line segments as though they were vectors. Much of the time the difference 
is not really important. But remember, a directed line segment is not a 
vector. A vector is really a property of a directed line segment. The vector 
determines the direction and length of the directed line segment but not 
its placement in space. 
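
The reader who wishes to check such computations by machine may find the following minimal Python sketch of Definitions 3-9 and 3-10 helpful. It assumes, as the example above does, that the direction numbers of a directed line segment are the coordinate differences (terminal point minus initial point). The names `vector_between` and `magnitude` are our own, chosen only for illustration, and points and vectors are both stored as plain tuples, so the parentheses and brackets of Convention 3-1 are not distinguished.

```python
import math


def vector_between(p1, p2):
    """Vector of the directed line segment from p1 to p2 (Definition 3-10)."""
    return tuple(b - a for a, b in zip(p1, p2))


def magnitude(v):
    """Magnitude (a1^2 + a2^2 + a3^2)^(1/2) of a vector (Definition 3-9)."""
    return math.sqrt(sum(c * c for c in v))


A = (3, 5, 2)
B = (-1, 6, 5)
AB = vector_between(A, B)  # the vector [-4, 1, 3] found in the text

# Translate both endpoints by the same (arbitrary) amount: the directed
# line segment changes, but the associated vector does not.
shift = (10, -2, 7)
A_moved = tuple(a + s for a, s in zip(A, shift))
B_moved = tuple(b + s for b, s in zip(B, shift))

print(AB, vector_between(A_moved, B_moved), magnitude(AB))
```

The two printed vectors agree, which illustrates the remark that a vector determines the direction and length of a directed line segment but not its placement in space.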


PROBLEMS 


1. Let the point A have coordinates $(a_1, a_2, a_3)$ and suppose that the vector 

$$\overrightarrow{AP} = [b_1, b_2, b_3].$$

What are the coordinates of the point P? 


2. If the weight in Fig. 3-8 is 100 grams and the two supporting strings make an 
angle of 90° with each other, find the force exerted by each string in terms of 
the angle between that string and the vertical. 


3. Let A and B be two points of space and suppose that under a translation they 
are translated to A’ and B’ respectively. Prove that 

$$\overrightarrow{AB} = \overrightarrow{A'B'}.$$


3-4 THE ALGEBRAIC OPERATIONS ON VECTORS 


As mentioned in the previous section, it is found useful to have an opera- 
tion which changes the magnitude of a vector without altering its di- 
rection. This operation, called scalar multiplication, is easily found 
from the comparison of directed line segments of different lengths which 
are on the same ray. 


Definition 3-11. Real numbers are called scalars. Given a vector 
$\mathbf{A} = [a_1, a_2, a_3]$ and a real number (scalar) t, the scalar multiple of 
$\mathbf{A}$ by t is 

$$t\mathbf{A} = [ta_1, ta_2, ta_3].$$


Geometrically, scalar multiplication can be thought of as multiplying 
the length of the vector (directed line segment) by the number t. Note 
that if a directed line segment has nonzero length and is represented by 
a vector A, then its length is |A| (which is a scalar) and the vector A/|A| 
has length one. The components of A/|A| are the direction cosines of the 
directed line segment. 

From Problem 1 of the last section, we see that vector addition in geometric 
terms corresponds exactly to the addition of vectors by components. 
Thus we have 


Definition 3-12. Given two vectors $\mathbf{A} = [a_1, a_2, a_3]$ and $\mathbf{B} = [b_1, b_2, b_3]$, 
the sum of these two vectors is the vector 

$$\mathbf{A} + \mathbf{B} = [a_1 + b_1, a_2 + b_2, a_3 + b_3].$$


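Another way of expressing the parallelogram law is to make use of directed 
line segments. Letting O represent the origin and letting A and P 
be points such that $\overrightarrow{OA} = \mathbf{A}$ and $\overrightarrow{AP} = \mathbf{B}$, we have 

$$\mathbf{A} + \mathbf{B} = \overrightarrow{OA} + \overrightarrow{AP} = \overrightarrow{OP}.$$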


Figure 3-9 illustrates these relationships. It should also help the student 
to see how componentwise addition gives the sum of the two vectors. 
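
The two operations defined so far can be sketched in a few lines of Python. This is only an illustration under our own naming (`scale`, `add`, `magnitude`, and `unit` are hypothetical helper names, not notation from the text), with vectors stored as tuples.

```python
import math


def scale(t, a):
    """Scalar multiple tA = [t*a1, t*a2, t*a3] (Definition 3-11)."""
    return tuple(t * c for c in a)


def add(a, b):
    """Componentwise sum A + B (Definition 3-12)."""
    return tuple(x + y for x, y in zip(a, b))


def magnitude(a):
    """Magnitude of a vector (Definition 3-9)."""
    return math.sqrt(sum(c * c for c in a))


def unit(a):
    """The vector A/|A|, whose components are the direction cosines of A."""
    m = magnitude(a)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return scale(1 / m, a)


A = (2, -3, 6)   # sample vector chosen so that |A| = 7
B = (1, 4, -2)

print(add(A, B))                 # componentwise sum
print(magnitude(scale(2, A)))    # doubling A doubles its length
print(magnitude(unit(A)))        # A/|A| has length one (up to rounding)
```

Because `unit` divides by a floating-point square root, its length is equal to one only up to rounding error; an exact treatment would keep the square root symbolic.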





FIGURE 3-9 


A special case of a scalar multiple of a vector is worth noting. Given 
any vector $\mathbf{A} = [a_1, a_2, a_3]$, the vector $(-1)\mathbf{A} = [-a_1, -a_2, -a_3]$ is 
such that the sum of it and the vector $\mathbf{A}$ is exactly the zero vector. It is 
useful to have a special notation for this vector and other negative forms. 


Definition 3-13. Given any vectors $\mathbf{A}$ and $\mathbf{B}$, we shall use the following 
notational conventions: 

$$-\mathbf{A} = (-1)\mathbf{A},$$
$$\mathbf{A} - \mathbf{B} = \mathbf{A} + (-\mathbf{B}).$$


The expression $\mathbf{A} - \mathbf{B}$ is called the difference between the two vectors 
and has a special geometric interpretation. In Fig. 3-10(a) we see the 
parallelogram with sides $\mathbf{A}$ and $-\mathbf{B}$ (dashed). The sum $\mathbf{A} + (-\mathbf{B})$ is 
then the dashed diagonal which is parallel and equal in length to the solid 
diagonal. Therefore, the difference $\mathbf{A} - \mathbf{B}$ is exactly the vector $\overrightarrow{BA}$, 
where B and A are the points associated with the vectors $\mathbf{B}$ and $\mathbf{A}$ 
respectively (see Fig. 3-10b). 

Although this geometric interpretation of the difference of two vectors 
seems to be purely intuitive, we can actually prove that it is valid (in 
terms of the definitions which have been given). 





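FIGURE 3-10, panels (a) and (b) 


Theorem 3-2. For any three points A, B, and C, 

$$\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC} \tag{3-5}$$

and 

$$\overrightarrow{AB} = \overrightarrow{CB} - \overrightarrow{CA}. \tag{3-6}$$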


Proof: Let $A = (a_1, a_2, a_3)$, $B = (b_1, b_2, b_3)$, and $C = (c_1, c_2, c_3)$. 
Then 

$$\overrightarrow{AB} = [b_1 - a_1, b_2 - a_2, b_3 - a_3],$$

$$\overrightarrow{BC} = [c_1 - b_1, c_2 - b_2, c_3 - b_3],$$

and 

$$\overrightarrow{AC} = [c_1 - a_1, c_2 - a_2, c_3 - a_3].$$


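It is then clear from Definition 3-12 that Eq. (3-5) is true. 
To prove Eq. (3-6), we could proceed in the same way, or merely 
observe that $\overrightarrow{AC} = -\overrightarrow{CA}$, and hence that 

$$\begin{aligned}
\overrightarrow{CB} - \overrightarrow{CA} &= \overrightarrow{CB} + \overrightarrow{AC} \\
&= \overrightarrow{AC} + \overrightarrow{CB} \\
&= \overrightarrow{AB},
\end{aligned}$$

where the last step follows from (3-5), which we have already proved. 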
In the middle step we made use of the commutativity of vector addition, 
which we have not proved, but which follows immediately from Definition 3-12. 

An important fact to note is that by virtue of Definition 3-10, we have 
the representation of a directed line segment in the following form: 

$$\overrightarrow{AB} = \mathbf{B} - \mathbf{A}. \tag{3-7}$$


This result can also be viewed as a special case of (3-6) when we realize 
that Convention 3-1 and Definition 3-10 can be thought of as saying 
that $\mathbf{A} = \overrightarrow{OA}$ and $\mathbf{B} = \overrightarrow{OB}$, where O is the origin. 
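
Relations (3-5), (3-6), and (3-7) are easy to spot-check numerically. The sketch below uses helper names of our own (`add`, `sub`, `vec`) and three arbitrarily chosen sample points; a check on particular points is of course no substitute for the proof given above.

```python
def add(a, b):
    """Componentwise sum of two triples."""
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    """Componentwise difference a - b, that is, a + (-1)b."""
    return tuple(x - y for x, y in zip(a, b))


def vec(p, q):
    """Vector of the directed line segment from p to q, computed as Q - P (3-7)."""
    return sub(q, p)


# Three sample points (arbitrary choices for illustration).
A, B, C = (3, 5, 2), (-1, 6, 5), (4, 0, -7)

eq_3_5 = add(vec(A, B), vec(B, C)) == vec(A, C)      # AB + BC = AC
eq_3_6 = vec(A, B) == sub(vec(C, B), vec(C, A))      # AB = CB - CA

print(eq_3_5, eq_3_6)
```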


In our informal discussions, we introduced scalar multiplication so as 
to change the magnitude of a vector while leaving its direction unchanged. 
That is, the new vector is parallel to the old. We would now like to make 
formal definitions which will allow us to speak of parallel line segments 
and vectors. 


Definition 3-14. Two directed line segments are called parallel if and 
only if they have the same set of direction cosines. Two vectors are 
called collinear if and only if one is a scalar multiple of the other, and 
are called parallel if and only if one is a nonnegative scalar multiple of 
the other. 


The definition of parallelism made here differs from that made in most 
elementary courses. Two directed line segments which would ordinarily 
be called parallel may not be called parallel according to this definition. 
For example, if we have A = (1, 1, 1), B = (1, 0, 2), and C = (2, 1, 3), 
then $\overrightarrow{OA} = \overrightarrow{BC}$ and the directed line segments OA and BC are parallel, 
but the directed line segments OA and CB are not called parallel. In 
order to be called parallel, two directed line segments must have the same, 
not opposite, directions. 

The above definition is consistent. That is, two directed line segments 
AB and CD are parallel if and only if the associated vectors $\overrightarrow{AB}$ and $\overrightarrow{CD}$ 
are parallel. The notion of collinearity does not fit well with thinking of 
vectors as arbitrarily located directed line segments. It is, however, 
geometrically evident when we think of vectors as directed line segments 
whose initial points are at the origin. Two line segments (not directed) 
are parallel if and only if the associated vectors are collinear. 

The zero vector, the vector which has zero for all its components, oc- 
cupies a peculiar position with regard to this definition. The way we have 
stated the definition, the zero vector is parallel (and collinear) to any 
vector. We could have avoided this by using nonzero and positive scalar 
multiples in the above definition, but it will turn out to be convenient to 
have the definitions in the form actually given. Remember that parallel 
vectors (disregarding the zero vector) have the same direction, while 
collinear vectors have the same or opposite directions. 
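
The distinction between collinear and parallel vectors in Definition 3-14 can be made concrete with a short Python sketch. The function names are our own, and the sketch assumes integer components so that the scalar multiplier can be computed exactly with `fractions.Fraction`; with floating-point components an exact equality test of this kind would not be reliable.

```python
from fractions import Fraction


def is_zero(a):
    return all(c == 0 for c in a)


def multiplier(a, b):
    """Return the scalar t with b == t*a, or None if there is none.

    Assumes a is nonzero and that all components are integers.
    """
    i = next(k for k, c in enumerate(a) if c != 0)
    t = Fraction(b[i], a[i])
    return t if all(t * x == y for x, y in zip(a, b)) else None


def collinear(a, b):
    """True when one vector is a scalar multiple of the other."""
    if is_zero(a) or is_zero(b):
        return True          # the zero vector is 0 times any vector
    return multiplier(a, b) is not None


def parallel(a, b):
    """True when one vector is a nonnegative scalar multiple of the other."""
    if is_zero(a) or is_zero(b):
        return True
    t = multiplier(a, b)
    return t is not None and t >= 0


OA = (1, 1, 1)
BC = (1, 1, 1)      # from B = (1, 0, 2) to C = (2, 1, 3)
CB = (-1, -1, -1)   # the same segment with its direction reversed

print(parallel(OA, BC), parallel(OA, CB), collinear(OA, CB))
```

With the points of the example above, OA and BC are reported as parallel, while OA and CB are collinear but not parallel, in agreement with the text.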

Now let us observe the algebraic laws satisfied by vector addition and 
scalar multiplication. There are many such laws, but we choose the 
following set for emphasis: 


Theorem 3-3. Vector addition satisfies the following properties: 


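P1. For any vectors $\mathbf{A}$ and $\mathbf{B}$, 
$$\mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A}.$$
P2. For any vectors $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$, 
$$\mathbf{A} + (\mathbf{B} + \mathbf{C}) = (\mathbf{A} + \mathbf{B}) + \mathbf{C}.$$
P3. There exists a vector $\mathbf{0}$ such that for any vector $\mathbf{A}$, 
$$\mathbf{A} + \mathbf{0} = \mathbf{A}.$$
P4. For any vector $\mathbf{A}$ there exists a vector $-\mathbf{A}$ such that 
$$\mathbf{A} + (-\mathbf{A}) = \mathbf{0}.$$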


These properties are (in view of the definitions we have given) fairly 
self-evident. The proofs are easy and can be left as problems for the 
student. The student will recognize these as properties of addition in the 
real number system. Since they are the same properties, they are referred 
to by the same names. That is, P1 is called the commutative property for 
addition of vectors, P2 is called the associative property for addition of vectors, 
P3 is the existence of the identity, and P4 is the existence of inverses. 

The student who has heard of a group will see that Theorem 3-3 can be 
rephrased more simply as: The set of vectors forms a commutative group 
with respect to addition. If you have never heard of this concept, you need 
not worry about it. Merely realizing that this particular set of properties 
occurs often enough to deserve a special name will be enough at this point. 

When we introduce scalar multiplication we obtain further properties 
of the set of vectors. Note that if we try to verify properties similar to 
those held by the real number system, we find immediate differences. 
For example, the scalar multiple of a vector is an indicated product of 
completely different entities. It is up to us to define what we mean by the 
indicated product in either order. We obviously wish to define them to be 
the same regardless of order. In other words, the commutative law holds 
by definition. We can, however, obtain a close equivalent to the associative 
property; and the distributive property also applies. In fact, since 
there are two distinct types of addition (addition of scalars and addition 
of vectors), there are two distributive properties. 


Theorem 3-4. The algebraic operations on vectors have the following 
properties: 


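P5. For any vector $\mathbf{A}$ and any scalars s and t, 
$$(st)\mathbf{A} = s(t\mathbf{A}).$$
P6. For any vector $\mathbf{A}$ and any scalars s and t, 
$$(s + t)\mathbf{A} = s\mathbf{A} + t\mathbf{A}.$$
P7. For any vectors $\mathbf{A}$ and $\mathbf{B}$ and any scalar t, 
$$t(\mathbf{A} + \mathbf{B}) = t\mathbf{A} + t\mathbf{B}.$$
P8. For any vector $\mathbf{A}$, 
$$1 \cdot \mathbf{A} = \mathbf{A}.$$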


The four properties listed here are again very simple to verify. Property 
8 may seem too trivial to mention, but it is included for a specific reason. 
It so happens that the properties P1 through P8 listed in the above two 
theorems are sufficient to characterize an algebraic system of considerable 
mathematical importance—a vector space or, as it is sometimes called, a 
linear space over the real numbers. This is a set of elements, called vectors, 
together with the real numbers and two operations, vector addition and 
scalar multiplication, satisfying the above eight properties (which would 
then be called postulates). The eighth property becomes quite important, 
allowing the complete algebra of vectors to be developed. 

There are many different types of vector spaces satisfying the above 
eight properties, but the system of vectors that we have developed re- 
quires only one additional property to characterize it completely. The 
remaining property concerns linear dependence and will be discussed in a 
later section. 

At this stage, the student need not worry about how additional algebraic 
properties of vectors could be proved from those listed above, but should 
merely use the properties as needed. Any doubts can usually be resolved 
by reference to the definitions. 
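
As an aid to intuition, the eight properties can be spot-checked on a handful of sample vectors and scalars. The Python sketch below is only that: a finite check on values we chose arbitrarily, using exact rational arithmetic, and not a proof. The helper names (`add`, `scale`, `neg`, `check_properties`) are hypothetical and not part of the text.

```python
from fractions import Fraction
from itertools import product

ZERO = (0, 0, 0)


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def scale(t, a):
    return tuple(t * c for c in a)


def neg(a):
    return scale(-1, a)


# Sample data: exact integers and fractions, so equality tests are exact.
vectors = [(1, -4, 8), (-7, 6, 6), ZERO, (Fraction(1, 2), 3, -5)]
scalars = [0, 1, -2, Fraction(3, 4)]


def check_properties():
    """Return True if P1 through P8 hold for every sample combination."""
    for a, b, c in product(vectors, repeat=3):
        assert add(a, b) == add(b, a)                                    # P1
        assert add(a, add(b, c)) == add(add(a, b), c)                    # P2
    for a in vectors:
        assert add(a, ZERO) == a                                         # P3
        assert add(a, neg(a)) == ZERO                                    # P4
        assert scale(1, a) == a                                          # P8
        for s, t in product(scalars, repeat=2):
            assert scale(s * t, a) == scale(s, scale(t, a))              # P5
            assert scale(s + t, a) == add(scale(s, a), scale(t, a))      # P6
    for a, b in product(vectors, repeat=2):
        for t in scalars:
            assert scale(t, add(a, b)) == add(scale(t, a), scale(t, b))  # P7
    return True


print(check_properties())
```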


To illustrate some of the concepts introduced in this section, let us give 
a proof, using vector methods, of the theorem: 


The diagonals of a quadrilateral bisect each other if and only if the 
quadrilateral is a parallelogram. 

Proof: Suppose that A, B, C, and D are the vertices of a quadrilateral 
with diagonals AC and BD. Let $M_1$ and $M_2$ be the midpoints of AC and 
BD, respectively. Then 


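$$\begin{aligned}
\mathbf{M}_1 = \overrightarrow{OM_1} &= \overrightarrow{OA} + \overrightarrow{AM_1} \\
&= \mathbf{A} + \tfrac{1}{2}\overrightarrow{AC} \\
&= \mathbf{A} + \tfrac{1}{2}(\mathbf{C} - \mathbf{A}) \\
&= \tfrac{1}{2}(\mathbf{A} + \mathbf{C})
\end{aligned}$$

(see also Problem 2). Similarly, 

$$\mathbf{M}_2 = \tfrac{1}{2}(\mathbf{B} + \mathbf{D}).$$

We observe that the diagonals of the quadrilateral bisect each other if and 
only if $M_1 = M_2$, or equivalently $\mathbf{M}_1 = \mathbf{M}_2$. 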

The theorem we are trying to prove is an “if and only if” proposition. 
This means that there are two separate results to be proved. First, let us 
prove that if the quadrilateral is a parallelogram, then the diagonals bisect 
each other. 

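The condition that ABCD be a parallelogram can be restated in the form 
$\overrightarrow{AB} = \overrightarrow{DC}$ (the opposite sides are parallel; make a sketch). Hence 
$\mathbf{B} - \mathbf{A} = \mathbf{C} - \mathbf{D}$, or $\mathbf{B} = \mathbf{A} + \mathbf{C} - \mathbf{D}$. Substituting this into the 
above formula for $\mathbf{M}_2$, we find 

$$\begin{aligned}
\mathbf{M}_2 &= \tfrac{1}{2}(\mathbf{B} + \mathbf{D}) \\
&= \tfrac{1}{2}(\mathbf{A} + \mathbf{C} - \mathbf{D} + \mathbf{D}) \\
&= \tfrac{1}{2}(\mathbf{A} + \mathbf{C}) \\
&= \mathbf{M}_1.
\end{aligned}$$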


Therefore, we conclude that the diagonals bisect one another. 

Next, we prove the other half of the theorem: if the diagonals bisect 
each other, then the quadrilateral is a parallelogram. 

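The assumption that the diagonals bisect each other is equivalent to 
$\mathbf{M}_1 = \mathbf{M}_2$, or $\mathbf{A} + \mathbf{C} = \mathbf{B} + \mathbf{D}$. This equation implies that 

$$\mathbf{A} - \mathbf{B} = \mathbf{D} - \mathbf{C},$$

or 

$$\overrightarrow{BA} = \overrightarrow{CD},$$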


which tells us that the quadrilateral is a parallelogram. 
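
The two halves of this proof can be illustrated numerically. In the Python sketch below, the function names and the sample quadrilaterals are our own choices; integer coordinates are assumed so that midpoints can be computed exactly as fractions.

```python
from fractions import Fraction


def vec(p, q):
    """Vector of the directed line segment from p to q."""
    return tuple(b - a for a, b in zip(p, q))


def midpoint(p, q):
    """Midpoint (1/2)(P + Q), computed exactly; assumes integer coordinates."""
    return tuple(Fraction(x + y, 2) for x, y in zip(p, q))


def is_parallelogram(a, b, c, d):
    """ABCD is a parallelogram when the vector AB equals the vector DC."""
    return vec(a, b) == vec(d, c)


def diagonals_bisect(a, b, c, d):
    """The diagonals AC and BD bisect each other when their midpoints agree."""
    return midpoint(a, c) == midpoint(b, d)


A, B, D = (0, 0, 0), (4, 1, 2), (1, 3, -1)
C = (5, 4, 1)                    # chosen so that C = B + D - A
par = (A, B, C, D)               # a parallelogram
non = (A, B, (5, 4, 2), D)       # one vertex moved: not a parallelogram

for quad in (par, non):
    print(is_parallelogram(*quad), diagonals_bisect(*quad))
```

For each sample quadrilateral the two tests agree, as the theorem requires.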

Exactly what has been proved here? This theorem was proved to be 
true in the particular model of euclidean space which we have been constructing. 
Its truth depends on the concepts of length and parallelism 
which we have defined. Similarly, if we use these methods to prove other 
geometric facts, let us remember that the proof is really only valid in the 
“vector space” model of euclidean space. It so happens that this model is 
an accurate representation of euclidean space, but proof of this will not 
be given here. 


PROBLEMS 


1. Let O (the origin), $P_1$, $P_2$, $P_3$, and $P_4$ be points in three-dimensional space. 
What algebraic condition on the vectors $\overrightarrow{OP_1}$, $\overrightarrow{P_1P_2}$, $\overrightarrow{P_2P_3}$, and $\overrightarrow{P_3P_4}$ is 
necessary and sufficient for the point $P_4$ to coincide with the origin? 


2. Let $\mathbf{A}$ and $\mathbf{B}$ be any two distinct vectors. For any real t between 0 and 1, 
let $\mathbf{P} = (1 - t)\mathbf{A} + t\mathbf{B}$. Where is the point P in relation to the points A 
and B? [Hint: See Definition 3-10 and compare with Theorem 3-1.] 


3. Verify the properties listed in Theorems 3-3 and 3-4. 


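4. Define the vectors $\mathbf{e}_1 = [1, 0, 0]$, $\mathbf{e}_2 = [0, 1, 0]$, and $\mathbf{e}_3 = [0, 0, 1]$. Find 
scalars u, v, and w such that 

$$u\mathbf{e}_1 + v\mathbf{e}_2 + w\mathbf{e}_3 = \mathbf{A},$$

where 
(a) $\mathbf{A} = [7, 3, -4]$, (b) $\mathbf{A} = [1, 0, 3]$, 
(c) $\mathbf{A} = [0, 0, 0]$, (d) $\mathbf{A} = [a_1, a_2, a_3]$. 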


5. Find u, v, and w to satisfy the same conditions as those of Problem 4 if 
$\mathbf{e}_1 = [1, 1, 0]$, $\mathbf{e}_2 = [1, -1, 1]$, and $\mathbf{e}_3 = [-1, 1, 2]$. 


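6. Can you find u, v, and w such that 

$$u\mathbf{e}_1 + v\mathbf{e}_2 + w\mathbf{e}_3 = [1, 0, 0],$$

if 
(a) $\mathbf{e}_1 = [1, 1, 0]$, $\mathbf{e}_2 = [1, -1, 1]$, $\mathbf{e}_3 = [2, 0, 1]$, 
(b) $\mathbf{e}_1 = [0, 0, 1]$, $\mathbf{e}_2 = [0, 1, 1]$, $\mathbf{e}_3 = [0, -1, 2]$? 
If not, why not? 

7. For each part of Problem 6, find u, v, and w not all zero such that 

$$u\mathbf{e}_1 + v\mathbf{e}_2 + w\mathbf{e}_3 = \mathbf{0}.$$
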
8. For each of the following vectors, find a vector B with the same direction 
but magnitude 1. 
(c) $\mathbf{A} = [12, 0, -5]$, (d) $\mathbf{A} = [3, 5, 7]$. 


9. Which of the following systems satisfy all eight of the properties P1 through 
P8, and hence are vector spaces? If a system does not satisfy all the properties, 
which fail to hold (in general)? 

(a) The set of all n-tuples of real numbers $\mathbf{A} = [a_1, a_2, \ldots, a_n]$ with vector 
addition defined by 

$$[a_1, a_2, \ldots, a_n] + [b_1, b_2, \ldots, b_n] = [a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n],$$

and scalar multiplication defined by 

$$t[a_1, a_2, \ldots, a_n] = [ta_1, ta_2, \ldots, ta_n].$$


(b) The set of all ordered pairs of real numbers $\mathbf{A} = [a_1, a_2]$ with vector 
addition defined by 

$$[a_1, a_2] + [b_1, b_2] = [a_1 + b_1, a_2 + b_2],$$

and scalar multiplication defined by 

$$t[a_1, a_2] = [ta_1, 0].$$


(c) The set of all real-valued functions f defined for all x, 0 < x < 1, with 
vector addition defined by f + g being the function whose value at x is 
f(x) + g(x), and scalar multiplication defined by tf being the function 
whose value at x is tf(x). 


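10. Let A and B be two points in space. How are $\overrightarrow{AB}$ and $\overrightarrow{BA}$ related? 

11. Let A, B, C, and D be any four distinct points in space. 
(a) Show that 

$$\overrightarrow{AD} = \overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CD}.$$

(b) Let E, F, G, and H be the midpoints of the line segments AB, BC, CD, 
and DA respectively. Show that 

$$\overrightarrow{EF} = \tfrac{1}{2}\overrightarrow{AB} + \tfrac{1}{2}\overrightarrow{BC},$$
$$\overrightarrow{HG} = \tfrac{1}{2}\overrightarrow{AD} + \tfrac{1}{2}\overrightarrow{DC}.$$

(c) Show that $\overrightarrow{EF} = \overrightarrow{HG}$. Express this as a geometrical theorem. 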


12. Let A, B, and C be any three distinct points in space. Let D be the midpoint 
of the line segment BC. Then AD is a median of the triangle ABC. 
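(a) Find $\overrightarrow{AD}$ in terms of the vectors $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$. 
(b) Find $\mathbf{A} + \tfrac{2}{3}\overrightarrow{AD}$ in terms of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$. 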
(c) What would happen if the roles of A, B, and C were interchanged in the 
above? What theorem has been proved? Compare with Problem 8 of 
Section 3-2. 


Using vector methods, prove each of the following theorems. 


13. The line segment joining the midpoints of two sides of a triangle is parallel 
to, and one-half the length of, the third side. 


14. The line segment joining the midpoints of the nonparallel sides of a trapezoid 
is parallel to the bases and equal to half the sum of their lengths. 


15. The line segments joining the midpoints of opposite sides of any quadrilateral 
bisect each other. 


16. The midpoints of two opposite sides of any quadrilateral and the midpoints 
of the diagonals are the vertices of a parallelogram. 


17. The two lines from a vertex of a parallelogram to the midpoints of the 
opposite sides trisect the diagonal that they cross. 


3-5 PROJECTIONS AND THE DOT PRODUCT 


Let L be a given line and AB a given directed line segment in three- 
dimensional space. We wish to make an intuitive definition of the pro- 
jection of AB onto L. To do this, erect planes perpendicular to L through 
A and B and let these planes cut L at points A’ and B’ as in Fig. 3-11. 
We will call the directed line segment A’B’ the projection of AB onto the 
line L. 

In many texts, particularly those in engineering, it is common to find 
the projection defined as the length of the segment A’B’ described above 
rather than the segment itself. We will find it more useful, however, to 
consider the projection as a vector rather than as a scalar. 

It is the vector properties of the directed line segment A’B’ that we 
are really interested in. Note that if we use another line which is parallel 
to L in Fig. 3-11, the resulting projection will be a different directed line 
segment, but the associated vector will be the same. 

Similarly, suppose that we have another directed line segment, CD, 
which is parallel to and of the same length as AB, that is, $\overrightarrow{CD} = \overrightarrow{AB}$; 
then we see that the projection of CD onto L will be a directed line segment 
C’D’ such that $\overrightarrow{C'D'} = \overrightarrow{A'B'}$. This can be seen easily by observing 
that the two planes through C and D will be the same distance apart as 
the planes through A and B. From this, it follows that the vector $\overrightarrow{A'B'}$ 
is determined by the vector $\overrightarrow{AB}$. For this reason, we will say that the 
projection of the vector $\overrightarrow{AB}$ is the vector $\overrightarrow{A'B'}$. 

Observe that there is no difficulty here in having two different things 
being called a projection. The projection of a directed line segment is a 
directed line segment, while the projection of a vector is a vector. 

Given a vector and a line, we find the projection of the vector onto the 
line by drawing a directed line segment with that vector, finding the pro- 
jection of that directed line segment, and converting the resulting directed 
line segment into a vector. Since the result depends only on the direction 
of the line L, we might as well assume that the desired line L passes through 
the origin. Thus given a vector A, the obvious directed line segment to 
associate with $\mathbf{A}$ is OA. The projection of this onto L will then be OA’, 
and we denote the projection of $\mathbf{A}$ onto L by 

$$\operatorname{Proj}(\mathbf{A}) = \overrightarrow{OA'}.$$


When we use this notation, the line L is understood. To be completely 
correct, we should indicate the line in our notation, but this would be 
rather cumbersome for our purposes. 

We now wish to investigate (intuitively) some of the properties of this 
projection function in order to learn what the appropriate formal definition 
should be. For these intuitive purposes, it is clearly sufficient to think in 
terms of directed line segments. 





FIGURE 3-11        FIGURE 3-12 


Let us see what the projection of the sum of two vectors is. Let C = 
A + B and suppose we project C onto a line L. 

Observe Fig. 3-12. Here we see the vector C as the sum of A and B, 
and three planes drawn through the appropriate points. It is then clear 
that 

Proj (A + B) = Proj (A) + Proj (B). 


We also observe that for a fixed line L, the projection of t times $\mathbf{A}$ is t 
times the projection of $\mathbf{A}$, as is easily seen by observing the similar triangles 
formed when $\mathbf{A}$ is drawn as a directed line segment with its initial point 
on L (see Fig. 3-13). Thus, having fixed L, we can write for any vectors 
$\mathbf{A}$ and $\mathbf{B}$, and for any scalars s and t, the following relation: 

$$\operatorname{Proj}(s\mathbf{A} + t\mathbf{B}) = s \operatorname{Proj}(\mathbf{A}) + t \operatorname{Proj}(\mathbf{B}), \tag{3-8}$$

which is usually called the linearity property of the projection. 


FIGURE 3-13 


It is the direction of the line L which is important in determining the 
projection. For a given line L, all projections onto this line will be collinear; 
that is, they will all be scalar multiples of some one nonzero vector 
which determines the direction of L. Suppose that $\mathbf{B}$ is a nonzero vector 
on the line (i.e., that there are two points P and Q on the line such that 
$\overrightarrow{PQ} = \mathbf{B}$). This vector $\mathbf{B}$ will then determine a sense of direction on L. 
Think of the vectors $\mathbf{A}$ and $\mathbf{B}$ as directed line segments drawn from the 
same point and let $\theta$ be the angle between these two (Fig. 3-13 again). 
Then clearly, 

$$\operatorname{Proj}(\mathbf{A}) = |\mathbf{A}|(\cos\theta)\mathbf{e}, \tag{3-9}$$

where $\mathbf{e}$ is a vector of unit length parallel to $\mathbf{B}$. That is, $\mathbf{e} = \mathbf{B}/|\mathbf{B}|$, and so 

$$\operatorname{Proj}(\mathbf{A}) = |\mathbf{A}|(\cos\theta)\frac{\mathbf{B}}{|\mathbf{B}|}. \tag{3-10}$$


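Observe carefully that this result is independent of which vector $\mathbf{B}$ 
is taken on the line L. If $\mathbf{B}'$ is another nonzero vector, parallel to $\mathbf{B}$, that is, 
a positive scalar multiple of $\mathbf{B}$, then 

$$\frac{\mathbf{B}'}{|\mathbf{B}'|} = \frac{\mathbf{B}}{|\mathbf{B}|}.$$

If, however, $\mathbf{B}''$ is some negative multiple of $\mathbf{B}$ (hence is collinear but not 
parallel), then 

$$\frac{\mathbf{B}''}{|\mathbf{B}''|} = -\frac{\mathbf{B}}{|\mathbf{B}|},$$

but the angle $\theta''$ between $\mathbf{A}$ and $\mathbf{B}''$ will be the supplement of $\theta$, that is, 
$\theta'' = \pi - \theta$. Hence $\cos\theta'' = -\cos\theta$ and the net result is that 

$$\operatorname{Proj}(\mathbf{A}) = \frac{|\mathbf{A}|(\cos\theta)\mathbf{B}}{|\mathbf{B}|} = \frac{|\mathbf{A}|(\cos\theta'')\mathbf{B}''}{|\mathbf{B}''|}.$$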


Let $\mathbf{A} = [a_1, a_2, a_3]$ be a given vector and let $\mathbf{B} = [b_1, b_2, b_3]$ be a 
nonzero vector which determines the direction of a line L. We wish to 
find the projection of A onto L. We will do this with the help of the 
linearity property (3-8), breaking A up into the sum of three vectors, 
each in the direction of one of the coordinate axes. 


Definition 3-15. The unit coordinate vectors are the vectors 

$$\mathbf{e}_1 = [1, 0, 0], \quad \mathbf{e}_2 = [0, 1, 0], \quad \text{and} \quad \mathbf{e}_3 = [0, 0, 1].$$

With this definition, we can write 

$$\mathbf{A} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3$$

(see Fig. 3-14). This could also be written in compact form by using the 
summation symbol: 

$$\mathbf{A} = \sum_{i=1}^{3} a_i\mathbf{e}_i.$$





FIGURE 3-14 


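Now we need the projections of the three unit vectors. From (3-9), the 
projection of $\mathbf{e}_1$ in the direction of $\mathbf{B}$ is $|\mathbf{e}_1|(\cos\alpha)\mathbf{e} = (\cos\alpha)\mathbf{e}$, where 
$\mathbf{e} = \mathbf{B}/|\mathbf{B}|$ and $\alpha$ is the angle between the x-axis ($\mathbf{e}_1$) and the vector $\mathbf{B}$. 
However, $\cos\alpha$ is exactly the first direction cosine of $\mathbf{B}$, or $\cos\alpha = b_1/|\mathbf{B}|$. 
In exactly the same way the projections of $\mathbf{e}_2$ and $\mathbf{e}_3$ can be seen to be 
$(b_2/|\mathbf{B}|)\mathbf{e}$ and $(b_3/|\mathbf{B}|)\mathbf{e}$ respectively. 
Therefore, using the linearity property, we find that the projection of $\mathbf{A}$ 
onto L is 

$$\begin{aligned}
\operatorname{Proj}(\mathbf{A}) &= \operatorname{Proj}(a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3) \\
&= a_1 \operatorname{Proj}(\mathbf{e}_1) + a_2 \operatorname{Proj}(\mathbf{e}_2) + a_3 \operatorname{Proj}(\mathbf{e}_3) \\
&= \frac{a_1b_1}{|\mathbf{B}|}\mathbf{e} + \frac{a_2b_2}{|\mathbf{B}|}\mathbf{e} + \frac{a_3b_3}{|\mathbf{B}|}\mathbf{e} \\
&= (a_1b_1 + a_2b_2 + a_3b_3)\frac{\mathbf{e}}{|\mathbf{B}|} \\
&= (a_1b_1 + a_2b_2 + a_3b_3)\frac{\mathbf{B}}{|\mathbf{B}|^2}.
\end{aligned}$$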


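On the other hand, if $\theta$ is the angle between $\mathbf{A}$ and $\mathbf{B}$, this same projection 
is given, according to (3-9), by 

$$\operatorname{Proj}(\mathbf{A}) = |\mathbf{A}|(\cos\theta)\mathbf{e} = |\mathbf{A}|(\cos\theta)\frac{\mathbf{B}}{|\mathbf{B}|}.$$

Comparing these two results, we find the interesting relation 

$$|\mathbf{A}|\,|\mathbf{B}|\cos\theta = a_1b_1 + a_2b_2 + a_3b_3, \tag{3-11}$$

or, using the summation notation, we have 

$$|\mathbf{A}|\,|\mathbf{B}|\cos\theta = \sum_{i=1}^{3} a_ib_i.$$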


All of this discussion has been based on our understanding of the geometry 
of three-dimensional space. We want our algebraic development to 
correspond to the familiar geometry of space. We do this by seeing that 
our definitions correspond to the above results. In particular, we would 
like to have a simple symbol which would represent the combination of 
two vectors which appears on the right-hand side of (3-11). This then, is 
our motivation for the following definitions. 


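Definition 3-16. Given two vectors $\mathbf{A} = [a_1, a_2, a_3]$ and $\mathbf{B} = [b_1, b_2, b_3]$, 
the dot product of these two vectors is 

$$\mathbf{A} \cdot \mathbf{B} = a_1b_1 + a_2b_2 + a_3b_3.$$

The projection of $\mathbf{A}$ onto a line L whose direction is determined by a 
vector $\mathbf{B} \neq \mathbf{0}$ is 

$$\operatorname{Proj}(\mathbf{A}) = \frac{(\mathbf{A} \cdot \mathbf{B})\mathbf{B}}{|\mathbf{B}|^2}.$$

The cosine of the angle between the two vectors is defined to be 

$$\cos\theta = \frac{\mathbf{A} \cdot \mathbf{B}}{|\mathbf{A}|\,|\mathbf{B}|},$$

provided $\mathbf{A} \neq \mathbf{0}$ and $\mathbf{B} \neq \mathbf{0}$. In particular, the two vectors $\mathbf{A}$ and $\mathbf{B}$ 
are called orthogonal (or perpendicular) if and only if $\mathbf{A} \cdot \mathbf{B} = 0$. 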


Several remarks about this definition are in order. First, the quantity 
defined to be the cosine of the angle between the vectors has not been proved 
to be the actual cosine of a real angle. In order to know this, we would 
have to know that the quantity so defined always is between $-1$ and $+1$
or, what amounts to the same thing, that for any two vectors $A$ and $B$

$$|A \cdot B| \leq |A|\,|B|.$$


This happens to be true, as will be proved in the next section, so that we 
can use the above definition to define the angle between two vectors. The 
orthogonality condition then corresponds correctly to our usual ideas of 
orthogonality. 

A second observation is that according to the above definition, the zero 
vector, 0, is orthogonal to every vector. While this may seem unusual, it 
turns out to be a useful convention and so we will use it. 

Finally, the unit vectors $e_1$, $e_2$, and $e_3$, defined on p. 108, are mutually
orthogonal according to this definition. Since we have been basing our 
algebraic development on the geometric picture of mutually orthogonal 
coordinate axes, this fits in with our picture. 

The use of the name dot product for the type of product defined above 
is based only on the way we write the product. Other names for this 
concept are inner product and scalar product. The expression “inner
product” is probably the most suitable mathematically, but we will find
that the term “dot product” will serve our purposes. It is essential to give 
the full name, however. In a later section we will define a different type of 
product, which we will call the cross product, and with two types of product 
it is necessary to specify which is being used. 


As examples of the use of these concepts, let us suppose that A = 
[1, —4, 8], B = [—7, 6, 6], and C = [—4, 15, 8], and find 

(1) $A \cdot B$;

(2) The cosines of the angles between A and B and between A and C; 


(3) The projections of B and C onto the line whose direction is de- 
termined by A. 


We solve these problems as follows: 


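(1) $A \cdot B = [1, -4, 8] \cdot [-7, 6, 6] = -7 - 24 + 48 = 17.$

(2) Let $\theta$ be the angle between $A$ and $B$. Then

$$\cos\theta = \frac{A \cdot B}{|A|\,|B|} = \frac{17}{(81)^{1/2}(121)^{1/2}} = \frac{17}{99}.$$

Let $\phi$ be the angle between $A$ and $C$. Then

$$\cos\phi = \frac{A \cdot C}{|A|\,|C|} = \frac{-4 - 60 + 64}{|A|\,|C|} = 0.$$

So $A$ and $C$ are orthogonal.

(3) $$\operatorname{Proj}(B) = \frac{(B \cdot A)A}{|A|^2} = \frac{17}{81}A = \left[\frac{17}{81}, -\frac{68}{81}, \frac{136}{81}\right],$$

$$\operatorname{Proj}(C) = \frac{(C \cdot A)A}{|A|^2} = \frac{0}{81}A = 0.$$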
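
The three computations above can be checked mechanically. The following is only an illustrative Python sketch: the helper names `dot`, `length`, `cos_angle`, and `proj` are our own and not part of the text, and vectors are assumed to be plain lists of integers.

```python
from fractions import Fraction
from math import sqrt

def dot(u, v):
    """Dot product of two vectors given as equal-length lists (Definition 3-16)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def length(u):
    """Length |u|, computed from |u|^2 = u . u."""
    return sqrt(dot(u, u))

def cos_angle(u, v):
    """Cosine of the angle between two nonzero vectors."""
    return dot(u, v) / (length(u) * length(v))

def proj(u, onto):
    """Projection of u onto the line directed by `onto`: (u . onto) onto / |onto|^2."""
    scale = Fraction(dot(u, onto), dot(onto, onto))  # exact rational scale factor
    return [scale * c for c in onto]

A = [1, -4, 8]
B = [-7, 6, 6]
C = [-4, 15, 8]

print(dot(A, B))        # part (1)
print(cos_angle(A, B))  # part (2), angle between A and B
print(dot(A, C))        # zero exactly when A and C are orthogonal
print(proj(B, A))       # part (3)
print(proj(C, A))
```

Exact fractions are used for the projection so that the components can be compared with the hand computation without rounding.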


Let us now state the fundamental algebraic properties of the dot product. 


Theorem 3-5. The dot product of vectors is such that for any vectors
$A$, $B$, and $C$, and any scalar $s$, the following properties are satisfied:

P1. $A \cdot B = B \cdot A$.
P2. $A \cdot (B + C) = A \cdot B + A \cdot C$.
P3. $s(A \cdot B) = (sA) \cdot B = A \cdot (sB)$.
P4. $A \cdot A = |A|^2 \geq 0$ for every $A$, and $A \cdot A = 0$ if and only if
$A = 0$.

Again, the proofs of these facts are almost trivial and can be left to the
reader. Note that the first three can be summarized by the observation 
that as far as vector addition, scalar multiplication, and the dot product 
go, the ordinary rules of algebra apply. However, there is one rule which 
is definitely missing—the cancellation law. Suppose we have 


$$A \cdot B = A \cdot C,$$

and $A \neq 0$. Then we cannot conclude that $B = C$. We can say that
$A \cdot B - A \cdot C = 0$, and hence using P2, that

$$A \cdot (B - C) = 0.$$ *

But from this, all we can conclude is that $B - C$ is orthogonal to $A$, which
need not imply $B = C$. For example, if $A = e_1 = [1, 0, 0]$, $B = [1, 2, 1]$,
and $C = [1, 1, -2]$, the student can easily verify that $A \cdot B = A \cdot C$,
but clearly $B \neq C$.
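
The failure of the cancellation law in this example can be confirmed with a few lines of Python. This is a minimal sketch under the assumption that vectors are plain lists; the names `dot` and `difference` are illustrative only.

```python
def dot(u, v):
    """Dot product of two vectors given as equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

A = [1, 0, 0]   # the unit vector e_1
B = [1, 2, 1]
C = [1, 1, -2]

# B - C, which the argument above shows must be orthogonal to A
difference = [b - c for b, c in zip(B, C)]

print(dot(A, B) == dot(A, C))  # the two dot products agree
print(B == C)                  # yet the vectors themselves differ
print(dot(A, difference))      # A . (B - C)
```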

The properties listed in this theorem are again standard properties 
which have been found valuable in much more general situations. In 
fact, they appear as the postulates which must be satisfied by the inner 
product in a special mathematical structure of great importance, an inner 
product space. At some time in his career, the student will probably hear
of a Hilbert space. A real Hilbert space is a vector space with an inner
product which satisfies the postulates mentioned in this and the last
section, together with a completeness requirement which holds automatically
in the finite-dimensional spaces considered here.

The first property, P1, which is obviously a commutative law, is called 
the symmetric property of the dot product. Property P2, clearly a form of 
the distributive law, and property P3, which appears to be some form of an 
associative law, are together referred to as the bilinearity property of the 
dot product. This is best explained by observing that if we consider the 
vector A fixed, then for any vectors B and C and any scalars s and t, 


$$A \cdot (sB + tC) = s(A \cdot B) + t(A \cdot C),$$


which is exactly the linearity property mentioned in connection with 
projections. The word “bilinear” is used because this property also holds 
with respect to the first factor in the dot product. 

The property P4 is usually described by saying that the inner product
is positive definite. It is called positive because $A \cdot A \geq 0$ for every $A$,
and definite because this inner product is zero only when $A = 0$.


* Note that this argument makes implicit use of P3, with s = —1, in order 
to obtain the distributive law for the difference of two vectors.

The properties of the dot product given in these theorems can be generalized
in obvious ways. For example, using the properties of vector addition
and the distributive law of P2, we can generalize the distributive law
to the product of the sum of a set of vectors with the sum of another set.
Thus, for example, 


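$$
\begin{aligned}
(A_1 + A_2 + A_3) \cdot (B_1 + B_2) = {} & A_1 \cdot B_1 + A_1 \cdot B_2 + A_2 \cdot B_1 \\
& + A_2 \cdot B_2 + A_3 \cdot B_1 + A_3 \cdot B_2.
\end{aligned}
\tag{3-12}
$$

This same result can be written more concisely by using the summation
notation. Equation (3-12) is equivalent to

$$\left(\sum_{i=1}^{3} A_i\right) \cdot \left(\sum_{j=1}^{2} B_j\right) = \sum_{i=1}^{3}\sum_{j=1}^{2} A_i \cdot B_j.$$

The student may not be familiar with the double summation shown on the
right-hand side of this equation. This may be interpreted as follows:

$$
\begin{aligned}
\sum_{i=1}^{3}\sum_{j=1}^{2} A_i \cdot B_j &= \sum_{i=1}^{3} (A_i \cdot B_1 + A_i \cdot B_2) \\
&= A_1 \cdot B_1 + A_1 \cdot B_2 + A_2 \cdot B_1 + A_2 \cdot B_2 + A_3 \cdot B_1 + A_3 \cdot B_2.
\end{aligned}
$$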
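
A double summation translates directly into two nested loops. The Python sketch below compares the two sides of (3-12) for one arbitrary choice of vectors; the sample vectors and the helper names `dot`, `vec_sum`, and `double_sum` are assumptions made for illustration and do not come from the text.

```python
def dot(u, v):
    """Dot product of two vectors given as equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def vec_sum(vectors):
    """Component-wise sum of a list of vectors."""
    return [sum(components) for components in zip(*vectors)]

def double_sum(As, Bs):
    """Sum of A_i . B_j over every pair (i, j): the outer loop runs over i, the inner over j."""
    return sum(dot(a, b) for a in As for b in Bs)

# Arbitrary sample vectors A_1, A_2, A_3 and B_1, B_2
As = [[1, 0, 2], [0, -1, 3], [4, 1, 1]]
Bs = [[2, 2, 0], [-1, 5, 1]]

lhs = dot(vec_sum(As), vec_sum(Bs))  # (A_1 + A_2 + A_3) . (B_1 + B_2)
rhs = double_sum(As, Bs)             # the six-term expansion
print(lhs, rhs)
```

A single numerical agreement is of course not a proof; the general statement follows from repeated use of P2.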


PROBLEMS 


1. Let B be a nonzero vector which defines the direction of a line L. Prove that 
the projection of A onto L is given by 


$$\operatorname{Proj}(A) = (A \cdot e)e,$$
where $e = B/|B|$.


2. Find the projection of the vector A = [7, 1, —4] onto the lines determined 
by the vectors 


(a) B = [2, 6, 3] (b) B = [2, 2, -1]
(e) B = [—7, =a; 3] 


3. Find the projections of the vector A = [2, —4, 3] onto the lines determined 
by the vectors 
(c) B = [-8, 3, 6] (d) B = [-8, 16, -12]
(e) B = [2, -4, 0]

4. What are the projections of $A = [a_1, a_2, a_3]$ onto the coordinate axes?


5. Find the cosines of the angles between the vector A = [3, —2, —6] and the 
vectors B, where 


(a) B = [4, = 8] (b) B = [4, 7, -4] (c) B = [12, 4, 3]

6. Find the angle between the vector A = [2, -1, 2] and the vector B if
(a) B = [-1, 2, 2] (b) B = [-6, 3, -6] (c) B = [0, 1, -1]


7. Find some nonzero vector B orthogonal to A = [1, 2, 5]. 
8. Find a vector C orthogonal to both A = [1, 2, 5] and B = [2, 0, —1]. 


9. Prove Theorem 3-5. Note that the proof here is to be purely algebraic, 
depending only on Definition 3-16. 


10. Prove that the four properties of Theorem 3-5 hold for the vector space of 
Problem 9(a), Section 4, if the inner product is defined as 


$$A \cdot B = a_1b_1 + a_2b_2 + \cdots + a_nb_n.$$

11. The law of cosines can be stated as follows: if three sides of a triangle are of
lengths $|A|$, $|B|$, and $|C|$, and if the angle opposite the side of length $|C|$
is $\theta$, then $|C|^2 = |A|^2 + |B|^2 - 2|A|\,|B|\cos\theta$ (see Fig. 3-15).
(a) Assume the law of cosines, let $A = [a_1, a_2, a_3]$,
$B = [b_1, b_2, b_3]$, and suppose that the point $O$ in
Fig. 3-15 is the origin. Find $C$ and use the above
formula to compute $|A|\,|B|\cos\theta$.
(b) Assume the properties of Theorem 3-5 and compute
$|C|^2 = |A - B|^2$. Compare with the law of cosines.

Figure 3-15





Remark: Parts (a) and (b) of this problem together show that the law of cosines
is equivalent to the formula $A \cdot B = |A|\,|B|\cos\theta$ (granting the algebraic
properties of the inner product).


3-6 THE TRIANGLE INEQUALITY 


In Problem 11 of the last section, it was shown that the formula $A \cdot B =
|A|\,|B|\cos\theta$ is equivalent to the law of cosines. In this section we will
prove the inequality

$$|A \cdot B| \leq |A|\,|B|,$$

which was needed in the last section to justify the definition of the cosine
of the angle between two vectors, and we shall show that this inequality 
is essentially equivalent to the familiar geometric fact that the length of 
one side of a triangle is less than the sum of the lengths of the other two 
sides.

The above inequality is very important in many different situations. In
general vector spaces it is called the Cauchy-Schwarz inequality. 


Theorem 3-6. (The Cauchy-Schwarz Inequality.) Given any two
vectors $A$ and $B$,
$$|A \cdot B| \leq |A|\,|B|. \tag{3-14}$$


Proof: Observe carefully that in the following proof we make use of only 
those properties of vectors given in Theorems 3-3, 3-4, and 3-5. 
Let $t$ be any scalar. Then we may compute

$$
\begin{aligned}
|tA + B|^2 &= (tA + B) \cdot (tA + B) \\
&= t^2(A \cdot A) + 2t(A \cdot B) + (B \cdot B) \\
&= t^2|A|^2 + 2t(A \cdot B) + |B|^2.
\end{aligned}
\tag{3-15}
$$

The right-hand member of this equation is a quadratic in $t$ except when
the length of $A$ is zero.
Suppose that $|A| = 0$. Then $A = 0$ and hence $A \cdot B = 0$. It then follows
that inequality (3-14) is true (with equality holding in this case).
If $|A| \neq 0$, then we have a proper quadratic expression in (3-15). We
may complete the square in this expression. Doing so gives us

$$
\begin{aligned}
|tA + B|^2 &= t^2|A|^2 + 2t(A \cdot B) + \frac{(A \cdot B)^2}{|A|^2} + |B|^2 - \frac{(A \cdot B)^2}{|A|^2} \\
&= \left[t|A| + \frac{A \cdot B}{|A|}\right]^2 + \frac{1}{|A|^2}\left[|A|^2|B|^2 - (A \cdot B)^2\right].
\end{aligned}
$$

The left-hand side of this equation is nonnegative for any t. Hence 
this must also be true of the right-hand side. In particular, the smallest 
possible value of the right-hand side must be greater than or equal to zero. 
However, the right-hand side is the sum of two terms, one of which is 
squared. It will take on its smallest possible value when the squared
term is zero, that is, when $t = -(A \cdot B)/|A|^2$. We thus conclude that

$$\frac{1}{|A|^2}\left[|A|^2|B|^2 - (A \cdot B)^2\right] \geq 0,$$

or, since $|A|^2 > 0$,

$$(A \cdot B)^2 \leq |A|^2\,|B|^2,$$


which is equivalent to the Cauchy-Schwarz inequality. 
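
The key step of the proof, the minimum of the quadratic in $t$, can be illustrated numerically. The sketch below is a hedged Python illustration, not part of the proof: it reuses the vectors $A = [1, -4, 8]$ and $B = [-7, 6, 6]$ from the examples of the last section, and the names `q`, `t_min`, `q_min`, and `bound` are our own.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def q(t, A, B):
    """Return |tA + B|^2, the left-hand side of (3-15)."""
    w = [t * a + b for a, b in zip(A, B)]
    return dot(w, w)

A = [1, -4, 8]
B = [-7, 6, 6]

# Value of t at which the squared term vanishes: t = -(A . B)/|A|^2
t_min = Fraction(-dot(A, B), dot(A, A))
q_min = q(t_min, A, B)

# The remaining term (1/|A|^2)[|A|^2 |B|^2 - (A . B)^2]
bound = Fraction(dot(A, A) * dot(B, B) - dot(A, B) ** 2, dot(A, A))

print(t_min, q_min, bound)
print(dot(A, B) ** 2 <= dot(A, A) * dot(B, B))  # the Cauchy-Schwarz inequality, squared
```

Since `q_min` is the square of a length it cannot be negative, and it coincides with `bound`; this is exactly the observation from which the inequality is deduced.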

If the student feels that the above proof depends upon a trick, he is 
correct. While most proofs that he has seen earlier were probably fairly 
obvious, it might not be clear how a proof such as this might be discovered.
The reason for this is that there is no easy method for finding such proofs.
One must discover the special device (or trick) that makes the proof 
possible. Once one has seen such a proof, however, he should be able to 
make use of the same trick in other proofs. 

Now let us consider the sum of two vectors, $A + B$. If we draw a picture
showing the physical realization of this vector sum (Fig. 3-16) and use the
fact that the sum of the lengths of two sides of a triangle is greater than
the length of the third side, we have

Figure 3-16

$$|A + B| \leq |A| + |B|. \tag{3-16}$$


This inequality, for obvious reasons, is called the triangle inequality. The 
most interesting point is that we do not have to assume its truth or de- 
pend on the geometric discussion given here. We can prove this inequality 
using only the algebraic properties of vectors. 


Theorem 3-7. (The triangle inequality.) For any two vectors $A$ and $B$,
$$|A + B| \leq |A| + |B|.$$
Proof: Let us compute the square of the left-hand member of this inequality.
This is
$$|A + B|^2 = (A + B) \cdot (A + B) = |A|^2 + 2A \cdot B + |B|^2.$$
From the Cauchy-Schwarz inequality, $A \cdot B \leq |A|\,|B|$, and hence
$$|A + B|^2 \leq |A|^2 + 2|A|\,|B| + |B|^2 = (|A| + |B|)^2.$$


This last inequality implies the triangle inequality. 

An important thing to note about the work in this section is the use of 
the fact that $|R|^2 = R \cdot R$ to find the length of a vector which is given
as the sum of two or more vectors. Let us give a few other examples of 
the use of this fact. 


Theorem 3-8.
$$|A + B|^2 = |A|^2 + |B|^2$$
if and only if $A$ and $B$ are orthogonal.

Proof: We compute (exactly as above)

$$|A + B|^2 = |A|^2 + 2(A \cdot B) + |B|^2.$$

The conclusion of the theorem then follows immediately upon application
of Definition 3-16. 
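
The expansion of $|A + B|^2$ used in the last two proofs can be tried out on the vectors of the earlier examples. The following Python sketch is illustrative only; the helper names `dot`, `add`, and `length_squared` are assumptions of ours, and the vectors are those used in the examples following Definition 3-16.

```python
from math import sqrt

def dot(u, v):
    """Dot product of two vectors given as equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def add(u, v):
    """Component-wise vector sum."""
    return [ui + vi for ui, vi in zip(u, v)]

def length_squared(u):
    """|u|^2 computed as u . u."""
    return dot(u, u)

A = [1, -4, 8]
B = [-7, 6, 6]    # A . B = 17, so A and B are not orthogonal
C = [-4, 15, 8]   # A . C = 0, so A and C are orthogonal

# General expansion: |A + B|^2 = |A|^2 + 2(A . B) + |B|^2
lhs = length_squared(add(A, B))
rhs = length_squared(A) + 2 * dot(A, B) + length_squared(B)
print(lhs, rhs)

# Theorem 3-8: the cross term vanishes exactly for orthogonal vectors
print(length_squared(add(A, C)) == length_squared(A) + length_squared(C))
print(length_squared(add(A, B)) == length_squared(A) + length_squared(B))

# Triangle inequality for A and B
print(sqrt(lhs) <= sqrt(length_squared(A)) + sqrt(length_squared(B)))
```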

This same algebraic device can be used to give vector proofs of theorems 
involving the lengths of segments (and sometimes also theorems involving 
angles). For example, let us prove the following: 


The medians to the equal sides of an isosceles triangle are equal in length. 


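Proof: Let $A$, $B$, and $C$ be the vertices of the triangle and suppose that
$|AB| = |AC|$. The midpoints of the equal sides are given by $\tfrac12(A + B)$
and $\tfrac12(A + C)$, and so the lengths of the medians are $|\tfrac12(A + B) - C|$
and $|\tfrac12(A + C) - B|$. To prove these two equal we compute

$$
\begin{aligned}
|\tfrac12(A + B) - C|^2 - |\tfrac12(A + C) - B|^2
&= \left[\tfrac14|A|^2 + \tfrac14|B|^2 + |C|^2 + \tfrac12 A \cdot B - A \cdot C - B \cdot C\right] \\
&\quad - \left[\tfrac14|A|^2 + \tfrac14|C|^2 + |B|^2 + \tfrac12 A \cdot C - A \cdot B - B \cdot C\right] \\
&= \tfrac34|C|^2 - \tfrac34|B|^2 + \tfrac32 A \cdot B - \tfrac32 A \cdot C \\
&= \tfrac34\left[|C|^2 - |B|^2 + 2A \cdot B - 2A \cdot C\right].
\end{aligned}
$$

However, by the assumption of the equal length of the two sides we have

$$
\begin{aligned}
0 &= |C - A|^2 - |B - A|^2 \\
&= |C|^2 - 2A \cdot C + |A|^2 - |B|^2 + 2A \cdot B - |A|^2 \\
&= |C|^2 - |B|^2 + 2A \cdot B - 2A \cdot C.
\end{aligned}
$$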


Putting this into the above equation then proves the theorem. 
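
As a concrete illustration of this theorem, one may pick an isosceles triangle and compute the two medians directly. The triangle below is an arbitrary choice of ours, not one taken from the text, and the helper names are likewise assumed for this Python sketch.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def sub(u, v):
    """Component-wise vector difference u - v."""
    return [ui - vi for ui, vi in zip(u, v)]

def midpoint(u, v):
    """Midpoint (u + v)/2 with exact rational components."""
    return [Fraction(ui + vi, 2) for ui, vi in zip(u, v)]

# Vertices of a triangle with |AB| = |AC| = 5
A = [1, 1, 0]
B = [4, 5, 0]
C = [1, 1, 5]

median_from_C = sub(midpoint(A, B), C)  # (A + B)/2 - C
median_from_B = sub(midpoint(A, C), B)  # (A + C)/2 - B

# Squared lengths of the two medians, computed as R . R
print(dot(median_from_C, median_from_C))
print(dot(median_from_B, median_from_B))
```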

We remark that this proof could have been simplified by proper choice 
of notation, but this proof is offered to show how the result can be obtained 
even if no special care is taken. In the proof of the next theorem we show 
how one can choose the notation so as to simplify the computations. Let 
us prove: 


The diagonals of a rhombus intersect at right angles. 


Proof: Let A be a vertex of the rhombus and let R and S be the vectors 
associated with the two directed line segments forming the sides of the 
rhombus meeting at A. Then three of the vertices are given by A, A + R, 
and A + S. 

Since a rhombus is a parallelogram, the fourth vertex is given by 
$A + R + S$. The sides of a rhombus are all equal. This tells us that $|R| = |S|$.
The vectors giving the diagonals are

$$D_1 = (A + R + S) - A = R + S$$
and
$$D_2 = (A + R) - (A + S) = R - S.$$

To show that these two diagonals are orthogonal, we compute

$$D_1 \cdot D_2 = (R + S) \cdot (R - S) = |R|^2 - |S|^2 = 0,$$


which proves the theorem. 
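
The same computation can be followed on a particular rhombus. In the Python sketch below the vertex `A` and the side vectors `R` and `S` are arbitrary values chosen by us so that $|R| = |S| = 5$; nothing in the code is taken from the text beyond the formulas for the diagonals.

```python
def dot(u, v):
    """Dot product of two vectors given as equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def add(u, v):
    """Component-wise vector sum."""
    return [ui + vi for ui, vi in zip(u, v)]

def sub(u, v):
    """Component-wise vector difference u - v."""
    return [ui - vi for ui, vi in zip(u, v)]

A = [1, 2, 3]   # one vertex of the rhombus
R = [3, 4, 0]   # side vector with |R| = 5
S = [0, 0, 5]   # side vector with |S| = 5

# Vertices in order around the rhombus: A, A + R, A + R + S, A + S
vertices = [A, add(A, R), add(add(A, R), S), add(A, S)]

D1 = sub(vertices[2], vertices[0])  # equals R + S
D2 = sub(vertices[1], vertices[3])  # equals R - S
print(D1, D2, dot(D1, D2))
```

Changing `A` moves the rhombus without altering `D1` or `D2`, which is why the vertex may as well be taken at the origin.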

The reader should also observe how these computations can be simplified 
even further by letting the point A be the origin. He should then go back 
to the previous theorem and see how the calculations would be simplified 
if the vertex A of the triangle were at the origin. 


PROBLEMS 


1. Assume that the triangle inequality is true. By comparing the expansions
of $|A + B|^2$ and $(|A| + |B|)^2$, prove the Cauchy-Schwarz inequality.


2. Write the Cauchy-Schwarz inequality in terms of the components of the 
vectors involved. 


3. Verify that the Cauchy-Schwarz inequality holds for the vector space of 
Problem 9(a) of Section 3-4. Write this out in terms of components. 


4. Show that equality can hold in the Cauchy-Schwarz inequality only if the 
two vectors are collinear. 


5. Under what circumstances can equality hold in the triangle inequality? 
These circumstances can be determined by observing when equality holds 
in the proof. 


6. Write the triangle inequality in terms of components. 


7. Let $A = xe_1$, $B = ye_1$, where $e_1 = [1, 0, 0]$. What are $|A|$, $|B|$, and $|A + B|$?
What does the triangle inequality reduce to? 

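8. Let $O$ be the center of a circle, let $A$ and $B$ be the endpoints of a diameter
of the circle, and let $C$ be any other point on the circle. Set $\overrightarrow{OA} = R$ and
$\overrightarrow{OC} = P$. What are $\overrightarrow{OB}$, $\overrightarrow{AC}$, and $\overrightarrow{BC}$ in terms of $R$ and $P$? Prove that
$\overrightarrow{AC}$ and $\overrightarrow{BC}$ are orthogonal.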


Using vector methods, prove the following theorems: 


9. If the diagonals of a parallelogram are orthogonal, then the parallelogram 
is a rhombus. 


10. The midpoint of the hypotenuse of a right triangle is equidistant from the
three vertices.

11. If the diagonals of a parallelogram are equal in length, then the parallelogram
is a rectangle.

12. The line segments joining the midpoints of consecutive sides of a rhombus
form a rectangle.

13. The line segments joining the midpoints of consecutive sides of a rectangle
form a rhombus.

14. The sum of the squares of the distances from a point P to two opposite
vertices of a rectangle is equal to the sum of the squares of the distances from
P to the other two vertices.

15. State and prove the converse of Problem 14.

16. Use Theorem 3-16 to prove the Pythagorean theorem and its converse.

17. Prove that the sum of the squares of the lengths of the diagonals of a
parallelogram is equal to the sum of the squares of the four sides. Is the converse
true?


4 
Planes and Lines 


4-1 PLANES 


In his Elements, Euclid attempted to define planes. By our present-day
standards his definition leaves much to be desired, but then, it is not
easy to give a truly adequate definition. For example, if pressed, the student 
might try to define a plane as follows: A plane is a set of points with the 
property that if any two points are in it, then the entire straight line through 
these two points is in it. 

Leaving aside for the moment any questions about the existence of the 
line, there is still a great deal wrong with this attempted definition. Once 
again, the student should attempt to find the difficulty for himself before 
reading further. 

There are many ways in which we could proceed to give a definition for 
a plane, but we wish to choose one which can be generalized easily and 
which leads to useful results in higher dimensions. At the same time, we 
wish to have a definition which is both rigorous and easy to work with. 
The definition attempted above, for example, is inadequate because the 
entire space or a straight line satisfies it. We therefore will not attempt to 
patch up this definition, but will start fresh. 

The fundamental property we will use is that there exists a unique plane 
through a given point, perpendicular to a given line. The student may 
notice that this property has been used in the geometric discussions in 
earlier sections. It therefore is only right that we make this our defining
property.


Definition 4-1. Let $P_0$ be a given point in space and let $A$ be a given
nonzero vector. Then by the plane through $P_0$ orthogonal to $A$ we mean
the set of points $P = (x, y, z)$,

$$M = \{P \mid \overrightarrow{P_0P} \cdot A = 0\}.$$

Note that according to our conventions, $\overrightarrow{P_0P} = P - P_0$. Thus if
$P_0 = (x_0, y_0, z_0)$ and $A = [a, b, c]$, then $\overrightarrow{P_0P} = [x - x_0, y - y_0, z - z_0]$
and $\overrightarrow{P_0P} \cdot A = a(x - x_0) + b(y - y_0) + c(z - z_0)$, so the point
$(x, y, z)$ is on the plane if and only if

$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0. \tag{4-1}$$

If this equation is multiplied out, and we set $d = -ax_0 - by_0 - cz_0$,
then we have the equation

$$ax + by + cz + d = 0, \tag{4-2}$$


which must be satisfied by the coordinates of the points of the plane. This 
last equation has the form of a general linear equation. It is called the 
cartesian form of the equation of the plane. 

Suppose, on the other hand, that we are given a linear equation
$ax + by + cz + d = 0$, not all three of $a$, $b$, and $c$ being zero. For example,
suppose $a \neq 0$. Set $x_0 = -d/a$, $y_0 = 0$, and $z_0 = 0$. Then this equation
is equivalent to the equation $a(x - x_0) + b(y - y_0) + c(z - z_0) = 0$,
and hence is equivalent to $\overrightarrow{P_0P} \cdot A = 0$ when $P_0 = (x_0, y_0, z_0)$,
$P = (x, y, z)$, and $A = [a, b, c]$. That is, the set of all points $(x, y, z)$ whose
coordinates satisfy a nontrivial linear equation constitutes a plane. (A
nontrivial linear equation is a linear equation in which not all of the
coefficients of the variables are zero.) These facts are important enough to
warrant their collection into a theorem.


Theorem 4-1. The coordinates of all points on a plane satisfy a nontrivial
linear equation. Conversely, the set of all points whose coordinates
satisfy a nontrivial linear equation constitutes a plane.


Let us try to make clear exactly what we mean by saying that a vector 
is orthogonal to a plane. This idea is expressed in Definition 4-1, but
needs to be made precise. 


Definition 4-2. A vector $R$ is parallel to a plane $M$ if and only if there
exist points $P_1$ and $P_2$ in $M$ such that $\overrightarrow{P_1P_2} = R$.


Definition 4-3. A vector which is orthogonal to every vector parallel 
to a plane M is said to be orthogonal to M. 


These definitions should be clear enough. In order for the terminology
of Definition 4-1 to correspond to these definitions, we must have the 
following theorem: 


Theorem 4-2. Let M be the plane through Po, orthogonal to A, as in 
Definition 4-1. Then $A$ is orthogonal to $M$ in the sense of Definition 4-3.

Proof: Let $P_1$ and $P_2$ be any two points of $M$. From Definition 4-1
we have

$$A \cdot \overrightarrow{P_0P_1} = A \cdot (P_1 - P_0) = 0,$$
$$A \cdot \overrightarrow{P_0P_2} = A \cdot (P_2 - P_0) = 0.$$

However, then

$$
\begin{aligned}
A \cdot \overrightarrow{P_1P_2} &= A \cdot (P_2 - P_1) \\
&= A \cdot [(P_2 - P_0) - (P_1 - P_0)] \\
&= A \cdot (P_2 - P_0) - A \cdot (P_1 - P_0) \\
&= 0,
\end{aligned}
$$


which proves the theorem. 


There is essentially only one vector orthogonal to a given plane. That 
is, any two vectors orthogonal to the same plane must be collinear. We 
can obtain this result more easily at a later stage, but since it seems to be 
relevant at this point, let us prove it here. 


Theorem 4-3. Let $M$ be the plane through $P_0$ orthogonal to $A$. If $B$
is orthogonal to $M$, then $A$ and $B$ are collinear.

Proof: The vector $A \neq 0$. If $A = [a_1, a_2, a_3]$, then one of the three
components is not zero. Let us suppose for definiteness that $a_1 \neq 0$.
The proof would be similar in other cases. Let $B = [b_1, b_2, b_3]$ and let

$$t = b_1/a_1.$$

Now $P = (x, y, z)$ is a point of the plane if and only if

$$a_1(x - x_0) + a_2(y - y_0) + a_3(z - z_0) = 0. \tag{4-3}$$

Multiplying this equation by $t$, we have, since $ta_1 = b_1$,

$$b_1(x - x_0) + ta_2(y - y_0) + ta_3(z - z_0) = 0. \tag{4-4}$$

However, if $B$ is orthogonal to the plane, it must be orthogonal to $P - P_0$,
and hence

$$b_1(x - x_0) + b_2(y - y_0) + b_3(z - z_0) = 0.$$

Subtracting (4-4) from this, we see that for any point $P$ on the plane,

$$(b_2 - ta_2)(y - y_0) + (b_3 - ta_3)(z - z_0) = 0. \tag{4-5}$$

We now make use of this relation by choosing particular points on the
plane. First, set

$$x = x_0 - \frac{a_2}{a_1}, \quad y = y_0 + 1, \quad z = z_0. \tag{4-6}$$

Substituting these values into Eq. (4-3), we see that the point $(x, y, z)$ is
on the plane. When these values are put into (4-5), however, we find
$b_2 = ta_2$.
In the same way, if we set

$$x = x_0 - \frac{a_3}{a_1}, \quad y = y_0, \quad z = z_0 + 1, \tag{4-7}$$

we again have a point on the plane, and Eq. (4-5) gives us $b_3 = ta_3$. We
therefore have shown

$$b_1 = ta_1, \quad b_2 = ta_2, \quad b_3 = ta_3, \tag{4-8}$$

or $B = tA$, which proves the theorem.


Theorems 4-2 and 4-3 together show that, exclusive of scalar multiples, 
there is only one vector orthogonal to a given plane. Suppose, on the 
other hand, that two nonzero vectors A and B are collinear and that a 
point $P_0$ is given. Are the planes through $P_0$ orthogonal to $A$ and $B$ the
same? The answer is, of course, yes. 


Theorem 4-4. If A and B are nonzero collinear vectors and if Po is a 
fixed point, then the plane through Po orthogonal to A is identical to 
the plane through Po orthogonal to B. 


Proof: Let $B = kA$, where $k \neq 0$ (why is this possible?). Let

$$M_1 = \{P \mid A \cdot (P - P_0) = 0\},$$
and
$$M_2 = \{P \mid B \cdot (P - P_0) = 0\} = \{P \mid kA \cdot (P - P_0) = 0\}.$$

However, since $k \neq 0$, the expression in the last line can be zero if and
only if $A \cdot (P - P_0) = 0$, and hence we can conclude that $M_1 = M_2$.

The equation, in vector form, of the plane through a point $P_0$ orthogonal
to a given vector $A$ is

$$A \cdot (P - P_0) = 0.$$

By making use of the algebraic properties of the dot product, this can
also be written in the form

$$A \cdot P - A \cdot P_0 = 0, \tag{4-9}$$

or equivalently, in the form

$$A \cdot P = A \cdot P_0.$$

This last form is quite easy to remember, since it is obvious that $P = P_0$
satisfies this equation. Form (4-9), however, corresponds exactly to
(4-2), since if $A = [a, b, c]$ and $P = (x, y, z)$, then

$$A \cdot P = ax + by + cz,$$

while $A \cdot P_0$ is a constant which can be identified with $-d$, $d$ being the
constant in (4-2).

Suppose, for example, that we wish to find the cartesian equation of the
plane through $P_0 = (1, 2, 3)$ orthogonal to $A = [3, -4, 1]$. We have

$$A \cdot P = 3x - 4y + z,$$
while
$$A \cdot P_0 = 3 - 8 + 3 = -2,$$

and hence the desired equation is

$$3x - 4y + z + 2 = 0.$$

Conversely, if we are given an equation such as

$$2x + y - 3z - 5 = 0,$$

we recognize it as the equation of a plane orthogonal to $A = [2, 1, -3]$.
But what point does it pass through? All we need do is to find any $P_0$
whose coordinates satisfy the given equation. In this case, a simple
choice is $P_0 = (0, 5, 0)$. This equation is, therefore, the equation of the
plane through $(0, 5, 0)$ orthogonal to $[2, 1, -3]$, and is equivalent to

$$[2, 1, -3] \cdot ([x, y, z] - [0, 5, 0]) = 0.$$
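
Both directions of this correspondence are easy to mechanize. The Python sketch below is an illustration under our own naming assumptions (`plane_through`, `on_plane`, and `point_on_plane` are hypothetical helpers, not notation from the text): the first function forms the coefficients of (4-2) from a point and a normal vector, and the last recovers a point on the plane by the device used in the argument preceding Theorem 4-1, applied to the first nonzero coefficient.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def plane_through(P0, A):
    """Coefficients (a, b, c, d) of ax + by + cz + d = 0 for the plane
    through the point P0 orthogonal to the nonzero vector A."""
    if all(component == 0 for component in A):
        raise ValueError("the normal vector A must be nonzero")
    a, b, c = A
    d = -dot(A, P0)  # d = -(a*x0 + b*y0 + c*z0)
    return (a, b, c, d)

def on_plane(coeffs, P):
    """True when the coordinates of P satisfy the plane equation."""
    a, b, c, d = coeffs
    return a * P[0] + b * P[1] + c * P[2] + d == 0

def point_on_plane(coeffs):
    """One point of the plane: put -d/n on the axis of the first nonzero
    coefficient n and zero on the other two axes."""
    *normal, d = coeffs
    for i, n in enumerate(normal):
        if n != 0:
            P = [Fraction(0)] * 3
            P[i] = Fraction(-d, n)
            return tuple(P)
    raise ValueError("a nontrivial linear equation is required")

print(plane_through((1, 2, 3), [3, -4, 1]))   # first example above
print(on_plane((2, 1, -3, -5), (0, 5, 0)))    # the point chosen in the text
print(point_on_plane((2, 1, -3, -5)))         # another point of the same plane
```

The point returned by `point_on_plane` differs from the one chosen in the text; any point satisfying the equation serves equally well.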


PROBLEMS 


1. Let $M$ be the plane, as defined above, through the point $P_0$ orthogonal to
the vector $A$. Let $P_1$ be another point in $M$ and let $M_1$ be the plane through
$P_1$ orthogonal to $A$. Prove that $M = M_1$. (Recall that in order to show
that two sets are the same we must show that any point in the first is also
in the second, and any point in the second is also in the first.)

2. Find the equation, in standard form $ax + by + cz + d = 0$, of the plane
through the point $P_0$ orthogonal to $A$ if

(a) $P_0 = (1, 3, -2)$, $A = [1, 2, 7]$
(b) $P_0 = (4, 0, -1)$, $A = [2, 2, -1]$
(c) $P_0 = (2, 1, 0)$, $A = [0, 0, 1]$
(d) $P_0 = (1, 1, 1)$, $A = [1, 0, 0]$

3. Find the equation, in standard form, for the plane through $P_0$ orthogonal
to $A$ for each of the following. Find the point at which the resulting plane
cuts the z-axis.

(a) $P_0 = (1, 2, 3)$, $A = [5, 1, 1]$
(b) $P_0 = (0, 1, 0)$, A = [ , 1, -1]
(c) $P_0 = (5, -5, 0)$, $A = [1, 1, 2]$
(d) $P_0 = (1, 10, 1)$, $A = [-5, 1, -4]$

4. For each of the following equations, give a point $P_0$ and a vector $A$ such that
the equation is that of a plane through $P_0$ orthogonal to $A$:

(a) $3x - 2y + z + 5 = 0$
(b) $x + y + 1 = 0$

(c) 2e-+ y+ 44 = 0 

(d) $-x - 2y - 3z + 6 = 0$

5. By assuming an equation of the form $ax + by + cz + d = 0$, and treating
$a$, $b$, $c$, and $d$ as unknowns, find the equation of the plane containing the
three points $(1, 0, 2)$, $(2, 1, 1)$, and $(-1, -3, 3)$. How is it that we can
solve for four unknowns, with only three equations being given by these
three requirements?

6. Let $M$ be a plane through the origin. A vector $X = [x_1, x_2, x_3]$ is in the
plane $M$ if and only if the point $X = (x_1, x_2, x_3)$ is in the plane. Let $P_0$
be any point in space and set $N = \{P \mid \overrightarrow{P_0P} \in M\}$. What is $N$? Prove
your answer.

7. Let $B = [b_1, b_2, b_3]$ and $C = [c_1, c_2, c_3]$ be two nonzero vectors which are
not collinear. Suppose the vector $A = [a_1, a_2, a_3]$ is orthogonal to both
$B$ and $C$ (and $A \neq 0$). Prove that for any real $s$ and $t$, $sB + tC$ lies in the
plane through the origin orthogonal to $A$.

8. Verify the details of the proof of Theorem 4-3 by showing that the points
given by (4-6) and (4-7) are on the plane and that (4-8) follows.

9. At what points does the plane
$$ax + by + cz + d = 0$$
cut the three coordinate axes? Use these formulas to make a sketch showing
the location of the planes for each part of Problem 3. Draw the triangular
section of the plane determined by these three points. What happens in (b)?

10. (a) What is the equation of the sphere with center (1, 0, 12) which passes
through the point (3, 5, —2)? 
(b) Find the equation of the plane through the point (3, 5, —2) which is 
tangent to the sphere of part (a). [Hint: What line segment through 
this point would be orthogonal to the plane?] 


4-2 THE CROSS PRODUCT 


A number of problems in the preceding sections could be reduced to
finding a vector orthogonal to two given vectors. In particular, Problem 5
of the last section would be simplified if we could find such a vector.

Suppose two vectors $A = [a_1, a_2, a_3]$ and $B = [b_1, b_2, b_3]$ are given,
and we wish to find a vector $X = [x, y, z]$ simultaneously orthogonal to
both, that is, to find an $X$ satisfying $A \cdot X = 0$ and $B \cdot X = 0$, or

$$a_1x + a_2y + a_3z = 0,$$
$$b_1x + b_2y + b_3z = 0.$$

We wish to solve this pair of equations for the three unknowns $x$, $y$, and $z$.
The solution obviously cannot be unique (think geometrically; if any
nonzero vector $X$ satisfies the requirement, then so does any vector
collinear with $X$), but we will be satisfied to obtain any nonzero vector
solution.

In trying to solve this pair of equations we must be cautious. We cannot
divide through by any of the $a_i$ or $b_i$, since some may be zero. Therefore,
if we try to eliminate one of the unknowns, say $z$, we must do so by
multiplying the first equation by $b_3$ and the second by $a_3$ and subtracting. This
gives us

$$(a_1b_3 - a_3b_1)x + (a_2b_3 - a_3b_2)y = 0.$$

This equation can be satisfied (we cannot divide!) if we set

$$x = (a_2b_3 - a_3b_2),$$
$$y = -(a_1b_3 - a_3b_1).$$

We can put these two values into our original equations. After some
simplification, this gives

$$a_3a_2b_1 - a_3a_1b_2 + a_3z = 0,$$
$$b_3a_2b_1 - b_3a_1b_2 + b_3z = 0.$$

It can be seen that these are both satisfied if

$$z = a_1b_2 - a_2b_1.$$

These three values thus yield us a somewhat arbitrary solution to our
problem. We give this solution a special name.


Definition 4-4. Given two vectors $A = [a_1, a_2, a_3]$ and $B = [b_1, b_2, b_3]$,
the cross product of these two vectors is defined to be the vector

$$A \times B = [(a_2b_3 - a_3b_2), -(a_1b_3 - a_3b_1), (a_1b_2 - a_2b_1)]. \tag{4-10}$$

A useful mnemonic for remembering the form of the
cross product is the representation as a formal determinant:

$$A \times B = \begin{vmatrix} e_1 & e_2 & e_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}.$$

Expansion of this determinant by the cofactors of the top row gives a
representation of $A \times B$.
Thus, for example, to find $[-1, 3, 2] \times [5, 1, -1]$ we write

$$
\begin{aligned}
\begin{vmatrix} e_1 & e_2 & e_3 \\ -1 & 3 & 2 \\ 5 & 1 & -1 \end{vmatrix}
&= e_1\begin{vmatrix} 3 & 2 \\ 1 & -1 \end{vmatrix}
 - e_2\begin{vmatrix} -1 & 2 \\ 5 & -1 \end{vmatrix}
 + e_3\begin{vmatrix} -1 & 3 \\ 5 & 1 \end{vmatrix} \\
&= -5e_1 + 9e_2 - 16e_3 \\
&= [-5, 9, -16].
\end{aligned}
$$


After a bit of practice the reader should find it possible to write down the 
final vector directly after writing the second vector below the first: 


[-1, 3, 2],
[ 5, 1, -1],


and visualizing the appropriate cofactors (don’t forget the negative sign
on the second cofactor). When this is done, the student should always
check that the dot product of each of the original vectors with this result is zero.
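
Formula (4-10) and the recommended orthogonality check can be written out in Python as follows. This is a sketch only: the function name `cross` and the list representation of vectors are our own assumptions, and the sample vectors are those of the determinant example above.

```python
def dot(u, v):
    """Dot product of two vectors given as equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(A, B):
    """Cross product of two 3-component vectors, following (4-10)."""
    a1, a2, a3 = A
    b1, b2, b3 = B
    return [
        a2 * b3 - a3 * b2,      # cofactor of e_1
        -(a1 * b3 - a3 * b1),   # cofactor of e_2, with its negative sign
        a1 * b2 - a2 * b1,      # cofactor of e_3
    ]

A = [-1, 3, 2]
B = [5, 1, -1]
X = cross(A, B)

print(X)
print(dot(A, X), dot(B, X))  # both should vanish if X is orthogonal to A and B
```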

Geometrically, it is evident that there are exactly two vectors of a given 
magnitude orthogonal to a given pair of noncollinear vectors, A and B. 
If A and B are thought of as directed line segments from the same point, 
then they determine a plane through that point and there are two vectors 
of magnitude 1, say, orthogonal to that plane, one pointing to each side 
of the plane. 

Suppose A and B, contained in a plane M, are not collinear, and are 
represented as directed line segments from a common point. If we measure 
the angle from $A$ to $B$ counterclockwise in the plane $M$, we will get an angle
less than $\pi$ looking at the plane from one side, and an angle greater than $\pi$
looking at it from the other side. The vector $A \times B$ points toward the side
of the plane from which this angle is less than $\pi$.

Another way of expressing this result is in terms of the right-hand rule. 
If the right hand is held as in Fig. 3-2(e), with the thumb pointing in the
direction of the vector A and the first finger pointing in the direction of B 
(or as close to the direction of B as the joints will allow), then the middle 
finger will point in the direction of A x B. 

The proof of this would be tedious, and will not be given here. The 
student may use this fact without question, however. 

In (4-10) we have defined a unique vector obtained from two given 
vectors, that is, an algebraic product of a new type. Again, we are in- 
terested in the algebraic laws that are satisfied. Here we find some sur- 
prises. First, let us think of the commutative law. Interchanging A and B 
in the above definition amounts to interchanging the small a’s and small b’s. 
Note that if this is done, the two terms in each component of the definition 
of A X B are interchanged. But these two terms have opposite signs. 
Thus the commutative law does not hold in general. In fact, we actually 
have an anticommutative law: 


A x B = —(B x A). 


Note how this result is connected with the physical picture described 
above. Interchanging the roles of the vectors reverses the direction of the 
cross product. 

The next general property to investigate is the associative law. Here 
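again we find that the law fails in general. For, setting as usual
$e_1 = [1, 0, 0]$, $e_2 = [0, 1, 0]$, and $e_3 = [0, 0, 1]$, we find
$e_1 \times e_2 = e_3$, $e_2 \times e_3 = e_1$, and $e_2 \times e_2 = 0$. Hence

\[
\begin{aligned}
e_2 \times (e_2 \times e_3) &= e_2 \times e_1 = -e_3, \\
(e_2 \times e_2) \times e_3 &= 0 \times e_3 = 0.
\end{aligned}
\]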


The failure of these laws to hold means that care must be exercised in 
algebraic manipulations involving the cross product. 
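
The two failures just described can be checked numerically. The following is a minimal Python sketch and not part of the original text; it assumes that definition (4-10) is the usual component formula for the cross product, and the function name `cross` is chosen here only for illustration.

```python
def cross(a, b):
    """Cross product of two 3-component vectors (assumed form of definition (4-10))."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]

# Anticommutativity: A x B = -(B x A), tried on the vectors [-1, 3, 2] and [5, 1, -1].
A, B = [-1, 3, 2], [5, 1, -1]
axb = cross(A, B)
bxa = cross(B, A)

# Failure of associativity on the unit vectors.
left = cross(e2, cross(e2, e3))    # e2 x (e2 x e3) = e2 x e1
right = cross(cross(e2, e2), e3)   # (e2 x e2) x e3 = 0 x e3
```

Here `axb` and `bxa` differ only in sign, while `left` and `right` are different vectors, which is exactly the behavior described in the text.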

The associative and commutative laws do hold when we mix the scalar
multiple and the cross product. This property, which is usually called
the homogeneity property, is easily verified in the form
$t(A \times B) = (tA) \times B = A \times (tB)$. This follows from the observation that in the
definition of $A \times B$, each term in each component contains exactly one
component of A as a factor, so if each component of A, say, is multiplied
by t, a single factor of t appears in each component of $A \times B$. Similarly,
each term of each component of $A \times B$ contains exactly one component
of B as a factor, and hence $A \times (tB) = t(A \times B)$.

We will postpone discussion of the behavior of combinations of the dot
and cross products and turn to the distributive law. Looking at the
definition of $A \times B$, we see that each component is linear in the components of
B. Hence $A \times (B + C) = A \times B + A \times C$. Similarly, we could show
that the distributive law holds in the other order, or we can use the
anticommutative law to obtain the same result.

The above analysis can be summarized in the following theorem: 


Theorem 4-5. The cross product is a binary operation between vectors 
which is neither commutative nor associative, but which satisfies the 
following algebraic laws. 

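(1) Anticommutativity:

\[ A \times B = -(B \times A). \]

(2) Bilinearity:

\[
\begin{aligned}
(tA) \times B &= A \times (tB) = t(A \times B), \\
A \times (B + C) &= A \times B + A \times C, \\
(B + C) \times A &= B \times A + C \times A.
\end{aligned}
\]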


The bilinearity property of the cross product is essentially a distributive 
property. It can also be extended to the cross product of sums of vectors. 
For example, 


\[
\Bigl(\sum_{i=1}^{n} A_i\Bigr) \times \Bigl(\sum_{j=1}^{m} B_j\Bigr)
= \sum_{i=1}^{n} \sum_{j=1}^{m} A_i \times B_j.
\]


Similarly, the first part of the bilinearity property could also be combined 
with this same result (see Problem 11). 


Next we wish to examine the combinations of three vectors involving
the cross and dot products. Given two vectors B and C, $B \times C$ is also a
vector. Hence we can consider the dot product of this vector and a third
vector A. This particular combination, $A \cdot (B \times C)$, occurs frequently
enough to deserve a special name.


Definition 4-5. Given three vectors A, B, and C, the scalar triple product
of these three vectors, in this order, is $A \cdot B \times C$.


Note that in this definition, the parentheses have been left off $B \times C$.
This can be done since there is no way in which the result could be
misunderstood. Since the cross product is defined only between two vectors,
we could not take the dot product first.

If we let $A = [a_1, a_2, a_3]$, $B = [b_1, b_2, b_3]$, and $C = [c_1, c_2, c_3]$, then
it is easily verified from the definitions that

\[
A \cdot B \times C = a_1 b_2 c_3 + a_2 b_3 c_1 + a_3 b_1 c_2 - a_3 b_2 c_1 - a_1 b_3 c_2 - a_2 b_1 c_3.
\tag{4-11}
\]

This is easily seen by noting that 
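\[
\begin{aligned}
A \cdot B \times C
&= A \cdot \begin{vmatrix} e_1 & e_2 & e_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \\
&= A \cdot \left( e_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix}
 - e_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix}
 + e_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \right) \\
&= a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix}
 - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix}
 + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \\
&= \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix},
\end{aligned}
\tag{4-12}
\]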


which is itself a useful result. 
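
As an informal check of the six-term expansion (4-11), the Python sketch below compares it with the value obtained directly from the definition of the scalar triple product. It is not part of the original text: the vectors A, B, C are hypothetical test data, the helper names are illustrative, and `cross` again assumes the usual component formula for the cross product.

```python
def cross(a, b):
    """Cross product by components (assumed form of definition (4-10))."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def triple_expanded(a, b, c):
    """The six-term expansion (4-11) of A . B x C."""
    return (a[0] * b[1] * c[2] + a[1] * b[2] * c[0] + a[2] * b[0] * c[1]
            - a[2] * b[1] * c[0] - a[0] * b[2] * c[1] - a[1] * b[0] * c[2])

# Hypothetical vectors, chosen only to exercise the formulas.
A, B, C = [1, 2, 3], [0, 1, 4], [5, 6, 0]

via_definition = dot(A, cross(B, C))      # A . (B x C) from Definition 4-5
via_expansion = triple_expanded(A, B, C)  # the same number from (4-11)
```

Both computations give the same number for these vectors, as (4-11) requires.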

We can obtain the expansion of $C \cdot A \times B$ by replacing each a by c,
each b by a, and each c by b in (4-11). It is easily seen that if we do so,
this expression is unaltered. The three terms with the positive sign
interchange cyclically, and so do the three with the negative sign. In other
words,

\[ A \cdot B \times C = C \cdot A \times B. \]


This can also be put in the form

\[ A \cdot B \times C = A \times B \cdot C, \]


which is easily remembered as the fact that the dot and cross can be inter- 
changed without altering the scalar triple product. 

As a consequence of this and the anticommutative law, $B \times C = -C \times B$,
we can evaluate any of the six possible scalar triple products by
using A, B, and C in various orders. Another simple consequence is that
$A \cdot B \times C$ is zero if any two of the three vectors are identical. This
follows since the cross product of any vector and itself is zero. If two
vectors are collinear, the same result must follow, since a scalar multiple
can be factored out of the product. We therefore have:

Theorem 4-6. The scalar triple product is unchanged upon interchange
of the dot and cross product,

\[ A \times B \cdot C = A \cdot B \times C. \]

The value of the scalar triple product is zero if any two of the three
vectors are collinear.
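
The statements of Theorem 4-6 can be tried out on numbers. The Python sketch below is an illustration only, with hypothetical vectors and helper names, and with the usual component formula assumed for the cross product.

```python
def cross(a, b):
    """Cross product by components (assumed form of definition (4-10))."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical vectors for the check.
A, B, C = [1, 2, 3], [0, 1, 4], [5, 6, 0]

abc = dot(A, cross(B, C))     # A . B x C
cab = dot(C, cross(A, B))     # C . A x B, a cyclic rearrangement
axb_c = dot(cross(A, B), C)   # A x B . C, dot and cross interchanged
acb = dot(A, cross(C, B))     # B and C swapped: the sign reverses

three_B = [3 * b for b in B]             # a vector collinear with B
degenerate = dot(A, cross(B, three_B))   # two collinear vectors give zero
```

The first three values agree, the fourth is their negative (by the anticommutative law), and the last is zero.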


By the way in which it was constructed, the cross product of two vectors 
is orthogonal to both the original vectors. But what if the two vectors are 
collinear? There is no single direction orthogonal to both of them. What 
happens to the cross product? The answer is contained in 


Theorem 4-7. A x B = 0 if and only if A and B are collinear. 


Proof: Suppose that $A \times B = 0$. Now, if A = 0, then A and B are
automatically collinear; so suppose that $A \neq 0$. Then at least one
component of A is nonzero. Let us suppose that $a_1 \neq 0$ (the proof would be
similar if one of the other components were taken to be nonzero).

Let $t = b_1/a_1$. Then

\[ b_1 = t a_1. \]

The fact that $A \times B = 0$ means that all three components are zero, and
hence

\[ a_2 b_3 - a_3 b_2 = 0, \]

\[ a_1 b_3 - a_3 b_1 = 0, \]

and

\[ a_1 b_2 - a_2 b_1 = 0. \]

From the last of these, we have

\[ b_2 = \frac{b_1}{a_1}\, a_2 = t a_2. \]

From the second of the relations, we have

\[ b_3 = \frac{b_1}{a_1}\, a_3 = t a_3. \]

Thus we have proved

\[ B = [b_1, b_2, b_3] = [t a_1, t a_2, t a_3] = tA, \]

which proves the first half of the theorem.

The proof of the other half of the theorem will be left to the student
(see Problem 2 at the end of this section).

Let us illustrate the use of the cross product in the solution of a
problem. We wish to find the equation of the plane passing through the points
A = (1, 0, 2), B = (2, 2, -1), and C = (1, 1, 0). This plane must be
orthogonal to a vector which is in turn orthogonal to both of the vectors
$\overrightarrow{AC} = [0, 1, -2]$ and $\overrightarrow{BC} = [-1, -1, 1]$. Thus, a vector orthogonal to
the plane is

\[ \overrightarrow{AC} \times \overrightarrow{BC} = [-1, 2, 1]. \]


Therefore the equation of the desired plane is

\[ [-1, 2, 1] \cdot [x, y, z] - [-1, 2, 1] \cdot [1, 1, 0] = 0, \]

or

\[ -x + 2y + z - 1 = 0. \]
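
The same construction can be written as a short routine. The following Python sketch is an illustration and not the book's own procedure; the function names are hypothetical, and the cross product is assumed to be given by the usual component formula. It reproduces the plane through (1, 0, 2), (2, 2, -1), and (1, 1, 0).

```python
def cross(a, b):
    """Cross product by components (assumed form of definition (4-10))."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def plane_through(p1, p2, p3):
    """Return (n, d) such that n . X + d = 0 is the plane through three points.

    The normal n is AC x BC, with A = p1, B = p2, C = p3 as in the text.
    The points are assumed not to be collinear (otherwise n is the zero vector).
    """
    ac = [c - a for a, c in zip(p1, p3)]
    bc = [c - b for b, c in zip(p2, p3)]
    n = cross(ac, bc)
    d = -dot(n, p3)
    return n, d

points = ([1, 0, 2], [2, 2, -1], [1, 1, 0])
n, d = plane_through(*points)
# Each of the three points should make n . X + d vanish.
residuals = [dot(n, p) + d for p in points]
```

For these points the routine returns the normal [-1, 2, 1] and the constant -1, in agreement with the equation found above.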


PROBLEMS 


1. Find $A \times B$, given that
(a) A = [1, 3, 1], B = [7, 2, -1]
(b) A = [0, 1, 1], B = [1, 0, 1]
(c) A = [1, -2, 3], B = [-3, 2, -1]
(d) A = [5, 2, =H; B = [4, —7, 2] 
(e) A = [1, 1, 1], B = [-1, -1, 1]


2. Prove that $A \times B = 0$, given that A and B are collinear.


3. By actually computing the vectors involved, prove that
$t(A \times B) = (tA) \times B = A \times (tB)$.


4. By actually computing the vectors involved, prove that
$A \times (B + C) = A \times B + A \times C$.


5. Prepare a multiplication table for the cross products of the vectors e1, e2, 
and e3. 


6. There are twelve possible scalar triple products which can be written down
using A, B, and C. Write down each of these and, using the two operations
discussed at the end of this section, obtain its value in terms of $A \cdot B \times C$.


7. Find the equations of the planes containing the three given points: 
(a) (1, Se, 5), (0, =); —l), (—3, 5, 0) (b) (5, 0, 1), (2, 3, 1), (0, =); 3) 
(c) (2, ð, 3), (0, l, 1), (1, 3, 0) (d) (1, 4, 1), (0, =Z; 0), (—2, 2, —2) 


8. Find the equation of the plane containing the points (1,0,1) and (3, 1, 2) 
and parallel to B = [1, —1, 2]. 


9. Find a nonzero vector parallel to the plane $3x + y - z - 2 = 0$ and
orthogonal to B = [1, 0, 2].


10. Find a nonzero vector simultaneously parallel to both of the planes
$x + 2y - 2z + 1 = 0$ and $2x - y + 2z + 9 = 0$.


11. Verify that 


\[
\Bigl(\sum_{i=1}^{3} a_i e_i\Bigr) \times \Bigl(\sum_{j=1}^{3} b_j e_j\Bigr)
= \sum_{i=1}^{3} \sum_{j=1}^{3} a_i b_j \, e_i \times e_j.
\]


4-3 DISTANCE FORMULAS 


Let A and Q be given points and let B be a given nonzero vector. We
wish to find the distance from Q to the plane through A orthogonal to B.
Our knowledge of euclidean geometry tells us that this distance would be
the distance between the point Q and the point P on the plane which is
such that the vector $\overrightarrow{QP}$ is collinear with B (and hence orthogonal to the
plane). This vector is the projection of $\overrightarrow{QA}$ on the line with the
direction B. This gives us, from Definition 3-16,

\[ \overrightarrow{QP} = \frac{\overrightarrow{QA} \cdot B}{|B|^2}\, B, \]

and the required distance is (Fig. 4-1)

\[ |\overrightarrow{QP}| = \frac{|\overrightarrow{QA} \cdot B|}{|B|}. \]

[FIGURE 4-1]


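We should, however, verify this solution, using the algebraic properties
we have assumed. First, let us see that the point P is indeed on the plane.
From the definition of a plane, this is true if and only if
$\overrightarrow{AP} \cdot B = (\overrightarrow{QP} - \overrightarrow{QA}) \cdot B = 0$. But using the value given above for $\overrightarrow{QP}$, we have

\[
\begin{aligned}
(\overrightarrow{QP} - \overrightarrow{QA}) \cdot B
&= \frac{\overrightarrow{QA} \cdot B}{|B|^2}\,(B \cdot B) - (\overrightarrow{QA} \cdot B) \\
&= (\overrightarrow{QA} \cdot B) - (\overrightarrow{QA} \cdot B) \\
&= 0.
\end{aligned}
\]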


Next, we verify that the point P is the closest point of the plane to Q.
Suppose X is any other point on the plane. Since this plane is also the
plane through P orthogonal to B, we must have $\overrightarrow{PX}$ orthogonal to B.
But then $\overrightarrow{PX}$ must also be orthogonal to $\overrightarrow{QP}$, since $\overrightarrow{QP}$ and B are collinear.
We must then have

\[ |\overrightarrow{QX}|^2 = |\overrightarrow{QP} + \overrightarrow{PX}|^2 = |\overrightarrow{QP}|^2 + |\overrightarrow{PX}|^2 \]

from Theorem 3-8. From this, we can conclude that $|\overrightarrow{QX}|$ is always greater
than or equal to $|\overrightarrow{QP}|$, with equality holding only when X = P. We
have thus proved


Theorem 4-8. Given points A and Q and a nonzero vector B, the point
on the plane through A orthogonal to B which is closest to Q is P, where

\[ P = \frac{\overrightarrow{QA} \cdot B}{|B|^2}\, B + Q, \tag{4-13} \]

and the distance from Q to the plane is

\[ |\overrightarrow{QP}| = \frac{|\overrightarrow{QA} \cdot B|}{|B|}. \tag{4-14} \]
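
Formulas (4-13) and (4-14) translate directly into a small routine. The Python sketch below is only an illustration: the function name and the sample data are hypothetical, and integer coordinates are assumed so that the closest point can be computed exactly with fractions.

```python
from fractions import Fraction
from math import sqrt

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def closest_point_and_distance(Q, A, B):
    """Closest point P and distance from Q to the plane through A orthogonal to B.

    Assumes integer coordinates and a nonzero vector B.
    """
    QA = [a - q for a, q in zip(A, Q)]
    s = Fraction(dot(QA, B), dot(B, B))            # (QA . B) / |B|^2
    P = [s * b + q for b, q in zip(B, Q)]          # formula (4-13)
    distance = abs(dot(QA, B)) / sqrt(dot(B, B))   # formula (4-14)
    return P, distance

# Hypothetical data: the plane z = 1, written as the plane through (0, 0, 1)
# orthogonal to [0, 0, 2], and the point Q = (3, 4, 6).
P, distance = closest_point_and_distance([3, 4, 6], [0, 0, 1], [0, 0, 2])
```

For this simple plane the answer can be seen by inspection: the closest point lies directly below Q at height 1, and the distance is the difference in height.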


Rather than trying to memorize these formulas, many students find it
easier to remember only the formula for projections, and redevelop (4-13)
or (4-14) as needed, by thinking of Fig. 4-1. Note how (4-14) is related
to the formula given in Definition 3-16 for the cosine of the angle between
two vectors. If $\theta$ is the angle between $\overrightarrow{QA}$ and $\overrightarrow{QP}$ (or equivalently between
$\overrightarrow{QA}$ and B), then from Fig. 4-1 we would have

\[ |\overrightarrow{QP}| = |\overrightarrow{QA}|\, |\cos\theta|. \]

But from Definition 3-16,

\[ \cos\theta = \frac{\overrightarrow{QA} \cdot B}{|\overrightarrow{QA}|\, |B|}. \]

An immediate consequence of these two equations is (4-14).

Suppose the given plane has the equation ax + by + cz + d = 0, and we
wish to find the distance from this plane to the point $Q = (x_1, y_1, z_1)$.
To use the above formula, we need a point of the plane. Let
$A = (x_0, y_0, z_0)$ be such a point. Then we must have $ax_0 + by_0 + cz_0 = -d$.
The formula in Theorem 4-8 then gives the distance from Q to the plane as

\[
\frac{|[x_0 - x_1,\ y_0 - y_1,\ z_0 - z_1] \cdot [a, b, c]|}{(a^2 + b^2 + c^2)^{1/2}}
= \frac{|ax_0 + by_0 + cz_0 - (ax_1 + by_1 + cz_1)|}{(a^2 + b^2 + c^2)^{1/2}}
= \frac{|ax_1 + by_1 + cz_1 + d|}{(a^2 + b^2 + c^2)^{1/2}}.
\]


Theorem 4-9. The distance from the point $(x_1, y_1, z_1)$ to the plane
having equation ax + by + cz + d = 0 is

\[ \frac{|ax_1 + by_1 + cz_1 + d|}{(a^2 + b^2 + c^2)^{1/2}}. \tag{4-15} \]

The reader should note the simplification of this formula when the
normal vector [a, b, c] is taken to be of length one and also what the formula
reduces to when the distance from the origin to the plane is calculated.


Let us see an example of the use of the formulas developed above.
Consider the plane M with equation

\[ 5x - 14y + 2z + 9 = 0 \]

and the point Q = (-2, 15, -7). We wish to find the distance from Q
to the plane and the point on the plane which is closest to Q.

It is easy to find the distance from Q to M by the use of formula (4-15).
This formula gives us the distance

\[
\delta = \frac{|-10 - 210 - 14 + 9|}{(25 + 196 + 4)^{1/2}} = \frac{225}{15} = 15.
\]

We can use (4-13) to find the point P on M which is closest to Q. For
this formula we need a point A on M, however. We choose one arbitrarily
to satisfy the given equation. For example, we let A = (1, 1, 0). Then
$\overrightarrow{QA} = [3, -14, 7]$, and hence

\[
\begin{aligned}
P &= \frac{15 + 196 + 14}{225}\,[5, -14, 2] + [-2, 15, -7] \\
  &= \frac{225}{225}\,[5, -14, 2] + [-2, 15, -7] \\
  &= [3, 1, -5].
\end{aligned}
\]
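
The arithmetic of this example is easy to confirm by machine. The Python sketch below is an illustration only (the function names are hypothetical); it applies formula (4-15) and then formula (4-13) to the plane 5x - 14y + 2z + 9 = 0, the point Q = (-2, 15, -7), and the point A = (1, 1, 0) chosen on the plane.

```python
from math import sqrt

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def distance_to_plane(a, b, c, d, point):
    """Distance from a point to the plane ax + by + cz + d = 0, formula (4-15)."""
    x1, y1, z1 = point
    return abs(a * x1 + b * y1 + c * z1 + d) / sqrt(a * a + b * b + c * c)

Q = [-2, 15, -7]
B = [5, -14, 2]     # normal vector read off from the equation of the plane
A = [1, 1, 0]       # a point chosen on the plane, as in the text

delta = distance_to_plane(5, -14, 2, 9, Q)

# Closest point by formula (4-13): P = ((QA . B) / |B|^2) B + Q.
QA = [a - q for a, q in zip(A, Q)]
s = dot(QA, B) / dot(B, B)
P = [s * b + q for b, q in zip(B, Q)]

# P should satisfy the equation of the plane.
residual = dot(B, P) + 9
```

The computed distance and closest point agree with the values 15 and [3, 1, -5] obtained above, and the residual confirms that P lies on the plane.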


A plane which is parallel to the z-axis has an equation which contains no
z-term. This can also be thought of as the equation of a line in the
xy-plane (the line formed by the intersection with the xy-plane).
Consideration of the above formulas in these terms leads to the following conclusion:


Theorem 4-10. The distance from a point $(x_1, y_1)$ of the xy-plane to a
line ax + by + d = 0 is

\[ \frac{|ax_1 + by_1 + d|}{(a^2 + b^2)^{1/2}}. \]


Further discussion of this result can be found in the problems at the 
end of this section. 


One final formula which can be included at this time is for the
determination of the angle between two planes. When two planes intersect we may
choose a point on the line of intersection and measure the angle formed
between two lines, one in each plane, from this point. It can easily be seen
that no matter how the planes are situated, the resulting angle can fall
anywhere between 0 and $\pi$, inclusive. The two limiting cases result when
both lines are chosen to be along the line of intersection. In order to
specify the angle of intersection between two planes, we must decide which 
of these many possible angles to measure. Reference to geometric intuition 
tells us that we should choose to measure the angle between the two lines 
which are orthogonal to the line of intersection. 

It could actually be shown that this pair of lines has a stronger property. 
Indeed, if we fix a line in one plane and let the line in the other plane vary, 
we will find that we get many angles, but one of them will be a minimum. 
If we then let the first line vary, this minimum will change, but for some 
angle it will be a maximum. The maximum value of the minimum turns 
out to be the angle between the two lines orthogonal to the line of inter- 
section, exactly the angle we choose to call the angle between the two 
planes. 

Since we do not have these lines given to us, but we do know the vectors 
orthogonal to the two planes, we choose to define the angle between the 
planes as the angle between the orthogonal vectors. This corresponds to 
the familiar geometric fact that two angles whose sides are respectively 
orthogonal are equal. However, each plane has two distinct orthogonal 
vectors of a given length (one the negative of the other). There are
therefore two possible angles (between 0 and $\pi$). They are related by the fact
that their cosines are negatives of each other. Hence one of them is
between $\pi/2$ and $\pi$, and the other is between 0 and $\pi/2$. We will choose
the smaller of the two angles to call the angle between the two planes.


Definition 4-6. Let $M_1$ and $M_2$ be two planes and let $B_1$ and $B_2$ be
nonzero vectors orthogonal to $M_1$ and $M_2$ respectively. Then the cosine
of the angle between $M_1$ and $M_2$ is

\[ \frac{|B_1 \cdot B_2|}{|B_1|\, |B_2|}. \]

For example, the cosine of the angle between the planes with equations

\[ 3x - 6y + 6z - 5 = 0 \]

and

\[ 6x + 9y - 2z + 3 = 0 \]

is

\[ \frac{|18 - 54 - 12|}{9 \cdot 11} = \frac{16}{33}. \]


PROBLEMS


1. Find the distance from the given point to the given plane:

(a) (1, 3, 1), 3x + 7y - 5z + 3 = 0

(b) (2, 1, -5), 2x - y + 2z + 1 = 0

(c) (1, 1, 0), 3x + 4z + 2 = 0

(d) (2, 0, 3), 6x + 2y + 3z = 0

2. In the distance formula of Theorem 4-10, the absolute value of the numerator
is taken. What is the meaning of the sign of this quantity? Consider this
question in connection with the direction of the normal vector [a, b, c].


3. Let M be the plane with equation ax + by + cz + d = 0. Let P be the
point on M which is closest to $Q = (x_1, y_1, z_1)$. Show that

\[ P = [x_1, y_1, z_1] - \frac{ax_1 + by_1 + cz_1 + d}{a^2 + b^2 + c^2}\,[a, b, c]. \]


4. For each part of Problem 1, find the point on the given plane which is closest
to the given point.


5. If two numbers a and b are such that $a^2 + b^2 = 1$, then there exists an
angle $\alpha$ such that $a = \cos\alpha$, $b = \sin\alpha$. Show that the equation of any
straight line in the plane can be brought into the form

\[ x\cos\alpha + y\sin\alpha + p = 0. \]

For an equation in this form, what is the geometric meaning of $\alpha$ and p?
[Hint: Use Theorem 4-10.] This is called the normal form of the equation of
a line.


6. Find the cosine of the angle between the following pairs of planes:
(a) 3x - 2y + z - 5 = 0, 2x + 3y - z + 1 = 0

(b) 7x —z+1=0,7+y—1=0 

(c) y+z2=0,%7—y=0 

(d)x—ytezt+1 =0, 2x — 2y+2z2—3=0 

(e) 3x + 5y - 2z - 5 = 0, 2x + 2y + 8z + 7 = 0


7. What does formula (4-13) become if |B| = 1? For an arbitrary B, set
e = B/|B| and state (4-13) and (4-14) in terms of e.


8. What is the equation of the sphere with center (3, -9, -15) which is tangent
to the plane

\[ 4x - 7y - 4z + 27 = 0? \]

[Hint: What is the radius of the required sphere?]


9. Find the distance from the point (3, 5, 3) to the plane passing through the
three points (2, -5, -1), (-3, 1, 1), and (0, 2, 9) by first finding the
equation of the plane.

10. Let A, B, C, and Q be four noncollinear points in space and let $\overrightarrow{AB} = R$,
$\overrightarrow{AC} = S$, and $\overrightarrow{AQ} = T$. Prove that the distance from Q to the plane passing
through A, B, and C is

\[ \frac{|T \cdot R \times S|}{|R \times S|}. \]


11. Let $A_1$ and $A_2$ be two distinct points and let $\overrightarrow{A_1A_2} = B$. Let $M_1$ and $M_2$
be the planes through $A_1$ and $A_2$ respectively, each orthogonal to B. Let
Q be any point in $M_2$. Prove that the distance from Q to $M_1$ is |B|.


Remark: This proves that two distinct planes orthogonal to the same vector 
have no points in common (i.e. they are parallel). 


4-4 THE STRAIGHT LINE 


We now turn to a discussion of straight lines. Definition 3-8 can be 
rewritten in the following way: 


Definition 4-7. Given a point $P_0$ and a nonzero vector A, the straight
line determined by this point and this vector is

\[ L = \{X \mid \overrightarrow{P_0X} = tA,\ t \text{ any real number}\}. \]


If we use the representation of a point X as a vector X, this definition
can be written in the form

\[ X = P_0 + tA; \tag{4-17} \]

the same definition in terms of the components would give

\[
\begin{aligned}
x &= x_0 + ta, \\
y &= y_0 + tb, \\
z &= z_0 + tc,
\end{aligned}
\tag{4-18}
\]

where $P_0 = (x_0, y_0, z_0)$ and A = [a, b, c]. These three forms are
completely equivalent, and any one of them can be used as the situation
requires.

What is defined here is a set of points constituting the straight line. 
However, there is a natural direction associated with this line, namely the 
direction on the line induced by the parameter t. The line can actually be 
thought of as a coordinate line with t as the coordinate. We will use the 
notion of a direction on this line, induced by the parameter t, without 
further comment whenever needed.

Any of the three forms given above for the determination of the
coordinates of a point on the line is called the parametric form of the equation
of a line. Any desired information about the line can be found from its 
parametric equation. In many textbooks on analytic geometry, the 
standard form of a line is given by the symmetric equations. These are of 
limited utility, however, since for many lines special conventions must 
be introduced to give the symmetric equations a meaning. (A discussion 
of the symmetric equations will not be given here, but will be found in the 
problems at the end of this section.) For short, we will call (4-17) an 
equation of the line, and (4-18) equations of the line. 

What are equations of the line through $P_0 = (1, 1, 0)$ with the direction
A = [1, -2, 1]? From (4-17) we have in this case the equation

\[ X = [x, y, z] = [1, 1, 0] + t[1, -2, 1], \]

or equivalently,

\[ X = [1 + t,\ 1 - 2t,\ t]. \]

This last form can also be thought of as

\[ (x, y, z) = (1 + t,\ 1 - 2t,\ t), \]

which is a condensed method for writing the three equations of the form
(4-18).

Is the point (5, -7, 5) on this line? No, since the only possible way
of getting the first coordinate of the point on the line to be equal to 5 is
to put t = 4. This gives us the point (5, -7, 4) rather than the point
(5, -7, 5).

At what point does this line cross the plane y = 3? To have y = 3,
we see that we must have 1 - 2t = 3, or t = -1. This then gives the
point (0, 3, -1).

At what point does the line cross the plane whose equation is

\[ 5x + 6y + z + 1 = 0? \]

Substituting the coordinates of a general point into this equation, we
find that we must have

\[ 5(1 + t) + 6(1 - 2t) + t + 1 = 0, \]

or

\[ 12 - 6t = 0. \]

This is satisfied by t = 2, giving the point (3, -3, 2).
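
The three questions just answered follow one pattern: substitute the general point $P_0 + tA$ and solve for t. The Python sketch below is an illustration of that pattern, not a procedure given in the text; the function names are hypothetical, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

P0 = [1, 1, 0]     # point on the line, from the example
A = [1, -2, 1]     # direction vector, from the example

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

def point_at(t):
    """The point X = P0 + tA of the line, equation (4-17)."""
    return [p + t * a for p, a in zip(P0, A)]

def parameter_of(point):
    """Return t with point = P0 + tA, or None when the point is off the line."""
    k = next(i for i, a in enumerate(A) if a != 0)   # a coordinate that fixes t
    t = Fraction(point[k] - P0[k], A[k])
    return t if point_at(t) == list(point) else None

def meet_plane(a, b, c, d):
    """Parameter t at which the line meets the plane ax + by + cz + d = 0.

    Assumes the line is not parallel to the plane, so [a, b, c] . A != 0.
    """
    n = [a, b, c]
    return Fraction(-(dot(n, P0) + d), dot(n, A))

off_line = parameter_of((5, -7, 5))            # the point tested in the text
crossing_y3 = point_at(meet_plane(0, 1, 0, -3))   # the plane y = 3
t_plane = meet_plane(5, 6, 1, 1)                  # the plane 5x + 6y + z + 1 = 0
crossing_plane = point_at(t_plane)
```

The routine `parameter_of` reports that (5, -7, 5) is not on the line, and the two crossings come out at the points found above.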
The first question we raise is about the uniqueness of a straight line.
The definition is given in terms of a point on a line and a vector which
defines the direction of the line. We are interested in showing that the
line remains the same if we substitute another point on the line and another
vector collinear with the given vector.


Theorem 4-11. Let a point $P_1$ be on the line L determined by the point
$P_0$ and the vector A. Let $B \neq 0$ be collinear with A. Then the line
determined by $P_1$ and B is identical to L.


Proof: We have $L = \{X \mid X = P_0 + tA\}$. Set $M = \{X \mid X = P_1 + sB\}$.
Here we use a different parameter, s, to avoid confusion in
discussing the points in these two sets. We are given that $P_1$ is on the line
L; hence there exists a $t_0$ such that

\[ P_1 = P_0 + t_0 A. \]

Likewise, since B is collinear with A, there exists a nonzero constant k
such that B = kA. (Why must $k \neq 0$?)
Suppose $X \in M$. Then there is a real s such that

\[
\begin{aligned}
X &= P_1 + sB \\
  &= (P_0 + t_0 A) + skA \\
  &= P_0 + (t_0 + sk)A \\
  &= P_0 + tA,
\end{aligned}
\]

where we set $t = t_0 + sk$. This shows that if $X \in M$, then $X \in L$.
Conversely, if $X \in L$, then we have

\[
\begin{aligned}
X &= P_0 + tA \\
  &= P_0 + t_0 A + (t - t_0)A \\
  &= P_1 + sB,
\end{aligned}
\]

where $s = (t - t_0)/k$. This then completes the proof and shows that
L = M.


A line is determined by a point and a vector, but there are also many
other conditions which determine a line. This gives rise to problems of
how to obtain the parametric equations of a line when it is determined by
conditions other than the standard ones.

As a first example, we observe that two points determine a line, and that
if we are given two points $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$, we can
easily find the line determined by these two points. Either of the two
points will serve as a point on the line, and so we only need to find a vector
with the required direction. Clearly such a vector is $\overrightarrow{P_1P_2} = P_2 - P_1$.



Therefore, a parametric equation of the line determined by the points
$P_1$ and $P_2$ is

\[ X = P_1 + t(P_2 - P_1), \]

or

\[ X = (1 - t)P_1 + tP_2. \tag{4-19} \]


This is the form we used to define directed line segments in an earlier 
section and could have been used in this section. 

As an example of this, let us find an equation of the line through the 
points (1, 0, 2) and (3, 1, 5). Here, a vector giving the direction of the line 
is [3, 1, 5] — [1, 0, 2] = [2, 1, 3], and so a vector parametric equation 
for the line is 

X = [1, 0, 2] + t[2, 1, 3].
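
The two-point construction, and its equivalence with the form (4-19), can be sketched as follows. The Python below is an illustration with hypothetical function names; it uses the points (1, 0, 2) and (3, 1, 5) of the example.

```python
def line_through(p1, p2):
    """Return (point, direction) for the line X = P1 + t(P2 - P1)."""
    direction = [b - a for a, b in zip(p1, p2)]
    return list(p1), direction

def two_point_form(p1, p2, t):
    """Evaluate X = (1 - t)P1 + tP2, equation (4-19)."""
    return [(1 - t) * a + t * b for a, b in zip(p1, p2)]

P1, P2 = [1, 0, 2], [3, 1, 5]
point, direction = line_through(P1, P2)

# Both forms should give the same point for any t; t = 2 is tried here.
from_direction = [p + 2 * a for p, a in zip(point, direction)]
from_two_points = two_point_form(P1, P2, 2)
```

Note that t = 0 gives $P_1$ and t = 1 gives $P_2$ in the form (4-19), which is why the same expression served earlier to describe directed line segments.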


A second way in which a line can be determined is by a point and the 
requirement that it be orthogonal to two given vectors. If the two given 
vectors are A and B, then clearly the vector A X B is the desired vector 
determining the direction of the line. Thus, the line through a point Po 
orthogonal to the vectors A and B is given by the equation 


\[ X = P_0 + t(A \times B). \tag{4-20} \]


If we are given two nonparallel planes, they determine a line, their line
of intersection. Finding a vector giving the direction of this line is easy.
It is merely the cross product of the two orthogonal vectors of the planes.
(Why?) The only problem is to find a point on the line. This can be done
by taking the equations of the two planes, eliminating one of the unknowns
and assigning any convenient value to one of the remaining unknowns.
This procedure can be illustrated best by an example. Suppose we are asked
for an equation of the line of intersection of the two planes with equations

\[
\begin{aligned}
3x - y + 2z - 7 &= 0, \\
x + y - 5z + 5 &= 0.
\end{aligned}
\]

The orthogonal vectors are [3, -1, 2] and [1, 1, -5]. A direction vector
for the line is therefore $[3, -1, 2] \times [1, 1, -5] = [3, 17, 4]$. Eliminating
y between the equations gives

\[ 4x - 3z - 2 = 0. \]

Setting z = 2 in this equation gives 4x = 8, or x = 2. Putting these
values back into the second equation yields 2 + y - 10 + 5 = 0, or
y = 3. So a point on the line is (2, 3, 2). Hence a parametric equation of
the desired line is

\[ [x, y, z] = [2, 3, 2] + t[3, 17, 4]. \]

Other conditions could be given which would serve to determine a line,
but those listed above are the main ones which appear in practice. Other
sets of conditions appear only rarely.
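
The line of intersection found in the example above can be checked by substituting several of its points into both plane equations. The Python sketch below is an illustration only; the helper names are hypothetical, and the cross product is assumed to be given by the usual component formula.

```python
def cross(a, b):
    """Cross product by components (assumed form of definition (4-10))."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

n1, d1 = [3, -1, 2], -7    # the plane 3x - y + 2z - 7 = 0
n2, d2 = [1, 1, -5], 5     # the plane x + y - 5z + 5 = 0

direction = cross(n1, n2)  # direction of the line of intersection
point = [2, 3, 2]          # the point found by elimination in the text

def on_both_planes(X):
    """True when X satisfies both plane equations."""
    return dot(n1, X) + d1 == 0 and dot(n2, X) + d2 == 0

# Sample the line X = point + t * direction at a few integer values of t.
samples = [[p + t * a for p, a in zip(point, direction)] for t in range(-2, 3)]
all_on_line = all(on_both_planes(X) for X in samples)
```

Every sampled point satisfies both equations, which is what one expects since the direction vector is orthogonal to both normal vectors.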

When two equations represent the same plane, it is easy to recognize
the fact since this can happen only when one equation is a nonzero multiple
of the other. The situation is not so simple in the case of the parametric
equations of lines. For example, two of the lines

\[
\begin{aligned}
L_1 &: X = [1, -5, 3] + t[6, 8, -4], \\
L_2 &: X = [6, 0, 5] + t[-3, 4, 2], \\
L_3 &: X = [7, -3, -1] + t[-3, -4, 2], \\
L_4 &: X = [4, -7, 1] + t[9, 12, -6],
\end{aligned}
\]

are actually identical. But which two?

In order to be identical, it is first necessary that two lines have collinear
direction vectors. On this basis we can eliminate line $L_2$ from consideration.
It is not parallel to any of the others. The remaining three lines are all
parallel, however. Two will be identical if and only if they have a point
in common. We check this by seeing whether the “initial point” of one
line is on another.

We first check whether (7, -3, -1) is on $L_1$. To have x = 7 in $L_1$,
we must have t = 1. This gives the point (7, 3, -1), and we conclude that
this point is not on the line.

Next we see whether (4, -7, 1) is on $L_1$. To have x = 4 we must have
t = 1/2, giving the point (4, -1, 1) on $L_1$. Therefore $L_1$ and $L_4$ cannot be
identical.

Finally, we verify that the remaining two lines, $L_3$ and $L_4$, are identical
by noting that t = 1 in $L_3$ gives the point (4, -7, 1), which is on the line
$L_4$.
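
The test used above (collinear direction vectors plus one common point) can be phrased with cross products, using Theorem 4-7. The Python sketch below is one possible formulation and not the method of the text, which checks the points by substitution; the function names are hypothetical, and the direction vectors are taken as [6, 8, -4], [-3, 4, 2], [-3, -4, 2], and [9, 12, -6].

```python
from itertools import combinations

def cross(a, b):
    """Cross product by components (assumed form of definition (4-10))."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def same_line(p, a, q, b):
    """True when X = p + t a and X = q + s b describe the same line.

    The directions must be collinear (a x b = 0), and the vector from
    p to q must be collinear with a, so that q lies on the first line.
    """
    pq = [qi - pi for pi, qi in zip(p, q)]
    zero = [0, 0, 0]
    return cross(a, b) == zero and cross(pq, a) == zero

lines = {
    "L1": ([1, -5, 3], [6, 8, -4]),
    "L2": ([6, 0, 5], [-3, 4, 2]),
    "L3": ([7, -3, -1], [-3, -4, 2]),
    "L4": ([4, -7, 1], [9, 12, -6]),
}

identical = [(m, n) for m, n in combinations(sorted(lines), 2)
             if same_line(*lines[m], *lines[n])]
```

Only one pair survives the test, the pair identified in the text.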


PROBLEMS 


1. Give parametric equations of the lines joining the pairs of points listed below. 


(a) (1, 2, 7) and (-3, 1, 1) (b) (3, 1, 0) and (5, -2, 7)
(c) (11, 12, 13) and (2, 1, —1) (d) (1,1, 1) and (3, 1, —1) 
(e) (0, I, 2) and (0, l, 3) (f) (1, =S 1) and (0, =S 0) 


2. Find parametric equations of the lines of intersection of the pairs of planes 
listed. 
(a) 3x - 2y + z - 5 = 0, 2x + 3y - z + 1 = 0
(b) 7e —z+1=0,7+y-—-12=0 
(c) yr>2z2=0,4—y=0 
(d) zr — y+z+1 = 0, 2r — 2y+z— 3 = 0 
(e) 3r + 5y — 2z — 5 = 0, 2r + 2y + 8z +47 =0 
(f) c+>1=0,2+ty1z=0


3. For each (or a selected number) of the lines in Problem 1 (and/or 2), find
the value of the parameter and the coordinates of the points at which the
line cuts the three coordinate planes (that is, the planes with equations
x = 0, y = 0, and z = 0).


4. What is an equation of the line joining the points $(x_1, y_1, 0)$ and $(x_2, 0, z_2)$?
At what point does this line cut the plane z = 0? 


5. The line of intersection of the plane through $P_0$ orthogonal to A and the
plane through $P_0$ orthogonal to B is given by $X = P_0 + t(A \times B)$ according
to the result of the text. (Assume A and B are not collinear.) Prove that
every point on this line is common to both planes.


6. For the line $X = P_0 + tA$, show that the parameter t is, in general,
determined by any of the three coordinates of a point on the line in the form:

\[ t = \frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}. \]

The three expressions on the right-hand side can be set equal in pairs, giving
the equations of three planes. Identify these planes (a sketch will help).


The form

\[ \frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c} \]

is called the symmetric form of the equations of the line. Under what
conditions can this form be considered valid?


7. For each of the following, find the value of the parameter and the coordinates 
of the point at which the given line cuts the given plane. 
(b) X = [0, 1, —1] + i, 5, —2]; 7a —ytz2+2=0 
(c) X = [5, 8, 1] + t[1, 0, 8]; 2x - 2y - z - 5 = 0
(d) X = [3, 0, 5] + t[1, 1, -1]; 5x + z = 0


8. At what points does the line

\[ X = [1, 3, -12] + t[1, 0, 5] \]

cut the sphere

\[ (x - 6)^2 + (y + 1)^2 + z^2 = 81? \]


9. Find equations for the line through (1, -1, 2) which is orthogonal to the
plane 3x - 2y + z - 5 = 0.


10. Find equations for the line through (-1, 5, 0), orthogonal to the line
X = [1, 1, 2] + t[-1, 3, 0], and parallel to the plane x + y - 4z + 2 = 0.


11. If the given pair of lines intersect, find the point of intersection.
(b) X = [1, 2, 0] + t[5, 0, 7]; X = [0, 0, 8] + t[-8, 4, 2]
(c) X = [3, 4, 5] + t[1, -1, -2]; X = [9, -8, 3] + t[-1, 4, -3]
(d) X = [5, l, —5] F t[2, 1, 5]; X = (1, ails —14] se til, 3, =l] 


12. For parts (a) and (b) of Problem 11, find equations of the lines orthogonal 
to the given pair of lines and passing through the point of intersection. 


13. The angle between a line and a plane is defined to be $(\pi/2) - \phi$, where $\phi$
is the angle between the line and the vector orthogonal to the plane (choosing
the angle between 0 and $\pi/2$). Find the cosines of the angles between the
line and the plane in each of the four parts of Problem 7.


5


Vectors as 
Coordinate Systems 


5-1 SOME VECTOR IDENTITIES 


Let us consider the triple cross product $A \times (B \times C)$. As was
commented earlier, the parentheses are necessary here since the cross product
is not associative. This combination clearly represents a vector orthogonal
to A and to $B \times C$. A vector orthogonal to $B \times C$ must lie in the plane
(through the origin) determined by B and C and, as will be shown in the
next section, must therefore be a linear combination of B and C. If we
write $A \times (B \times C) = uB + vC$, then

\[
\begin{aligned}
A \cdot [uB + vC] &= u(A \cdot B) + v(A \cdot C) \\
&= A \cdot [A \times (B \times C)] \\
&= 0.
\end{aligned}
\]

This last follows since $A \cdot [A \times (B \times C)]$ is the scalar triple product of
three vectors, two of which are identical. As a consequence of this
calculation, we can conclude that $u = k(A \cdot C)$ and $v = -k(A \cdot B)$ for some
scalar k. An actual example shows that k = 1, but an attempt to prove
that k is a constant independent of A, B, and C would be very difficult.
The required argument is quite deep and involves the concept of
continuity. It would be very difficult to give a rigorous proof along these lines
at this stage.

The actual identity which we wish to prove is stated in the following
theorem.


Theorem 5-1. For any vectors A, B, and C,

\[ A \times (B \times C) = (A \cdot C)B - (A \cdot B)C. \tag{5-1} \]
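
Before a proof is attempted, the identity (5-1) can be tried on numbers. The Python sketch below is an illustration only: the vectors are hypothetical test data, the helper names are chosen for convenience, and the cross product is assumed to be given by the usual component formula. A numerical check of this kind is of course not a proof.

```python
def cross(a, b):
    """Cross product by components (assumed form of definition (4-10))."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical vectors for the check.
A, B, C = [1, 2, 3], [0, 1, 4], [5, 6, 0]

lhs = cross(A, cross(B, C))                                    # A x (B x C)
rhs = [dot(A, C) * b - dot(A, B) * c for b, c in zip(B, C)]    # (A.C)B - (A.B)C
```

For these vectors the two sides agree component by component.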


We will use this in obtaining all the other identities of this section and so 
would like to have a complete proof of it. A direct proof by calculating 
each side in terms of components is possible but extremely tedious. In- 
stead, we will offer two proofs, which, while still long, are shorter than 
direct computation. 
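First proof: Let $\mathbf{A} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3$, 
$\mathbf{B} = b_1\mathbf{e}_1 + b_2\mathbf{e}_2 + b_3\mathbf{e}_3$, and 
$\mathbf{C} = c_1\mathbf{e}_1 + c_2\mathbf{e}_2 + c_3\mathbf{e}_3$. Then 

\[
\mathbf{B} \times \mathbf{C} = (b_2c_3 - b_3c_2)\mathbf{e}_1 + (b_3c_1 - b_1c_3)\mathbf{e}_2 + (b_1c_2 - b_2c_1)\mathbf{e}_3,
\]

and 

\[
\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = a_1\mathbf{e}_1 \times (\mathbf{B} \times \mathbf{C}) + a_2\mathbf{e}_2 \times (\mathbf{B} \times \mathbf{C}) + a_3\mathbf{e}_3 \times (\mathbf{B} \times \mathbf{C})
\]

by the linearity of the cross product. Let us investigate the first term of 
this expansion. Using $\mathbf{e}_1 \times \mathbf{e}_1 = \mathbf{0}$, $\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3$, 
$\mathbf{e}_1 \times \mathbf{e}_3 = -\mathbf{e}_2$, and the linearity of the cross product, we have 

\[
\begin{aligned}
a_1\mathbf{e}_1 \times (\mathbf{B} \times \mathbf{C}) &= a_1(b_3c_1 - b_1c_3)\mathbf{e}_3 - a_1(b_1c_2 - b_2c_1)\mathbf{e}_2 \\
&= a_1c_1b_2\mathbf{e}_2 + a_1c_1b_3\mathbf{e}_3 - a_1b_1c_2\mathbf{e}_2 - a_1b_1c_3\mathbf{e}_3 \\
&= a_1c_1b_1\mathbf{e}_1 + a_1c_1b_2\mathbf{e}_2 + a_1c_1b_3\mathbf{e}_3 - a_1b_1c_1\mathbf{e}_1 - a_1b_1c_2\mathbf{e}_2 - a_1b_1c_3\mathbf{e}_3 \\
&= a_1c_1\mathbf{B} - a_1b_1\mathbf{C}.
\end{aligned}
\]

Here, in the next to last step, we added and subtracted the term $a_1b_1c_1\mathbf{e}_1$. 
In the same way we could show that 

\[
\begin{aligned}
a_2\mathbf{e}_2 \times (\mathbf{B} \times \mathbf{C}) &= a_2c_2\mathbf{B} - a_2b_2\mathbf{C}, \\
a_3\mathbf{e}_3 \times (\mathbf{B} \times \mathbf{C}) &= a_3c_3\mathbf{B} - a_3b_3\mathbf{C}.
\end{aligned}
\]

Adding these three results together would give (5-1). 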
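
As a quick numerical sanity check of (5-1), both sides can be computed for specific vectors. The sketch below is an illustration only and is no substitute for the proof; the helper names `dot`, `cross`, and `triple_cross_expansion` are our own, and the sample vectors are those of Problem 4 at the end of this section.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def triple_cross_expansion(a, b, c):
    """Right-hand side of (5-1): (A.C)B - (A.B)C."""
    ac, ab = dot(a, c), dot(a, b)
    return [ac * bi - ab * ci for bi, ci in zip(b, c)]

A, B, C = [1, 2, 5], [-1, -5, 2], [1, 3, 2]
lhs = cross(A, cross(B, C))             # A x (B x C), computed directly
rhs = triple_cross_expansion(A, B, C)   # (A.C)B - (A.B)C
print(lhs, rhs)
```

With integer components the comparison is exact, so the two lists can be compared with `==`.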


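Second proof: Let us first look at the unit vectors. Since the cross product 
of two collinear vectors is zero, 

\[
\mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_k) = \mathbf{0} \quad \text{if } j = k.
\]

On the other hand, if $j \ne k$, then $\mathbf{e}_j \times \mathbf{e}_k$ is collinear with the third unit 
vector, and hence 

\[
\mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_k) = \mathbf{0} \quad \text{if } j \ne k \text{ and } i \ne j,\ i \ne k.
\]

With the help of the right-hand rule it is easy to verify that 

\[
\mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_k) =
\begin{cases}
-\mathbf{e}_k & \text{if } i = j \text{ and } j \ne k, \\
\phantom{-}\mathbf{e}_j & \text{if } i = k \text{ and } j \ne k.
\end{cases}
\]

All of these cases are summarized by the single formula 

\[
\mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_k) = (\mathbf{e}_i \cdot \mathbf{e}_k)\mathbf{e}_j - (\mathbf{e}_i \cdot \mathbf{e}_j)\mathbf{e}_k,
\]

that is, (5-1) holds if A, B, and C are the unit vectors. 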
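To save writing, we will make use of the summation notation and write 

\[
\mathbf{A} = \sum_{i=1}^{3} a_i\mathbf{e}_i, \qquad
\mathbf{B} = \sum_{j=1}^{3} b_j\mathbf{e}_j, \qquad
\mathbf{C} = \sum_{k=1}^{3} c_k\mathbf{e}_k.
\]

Then from the linearity of the cross product, we have 

\[
\mathbf{B} \times \mathbf{C} = \sum_{j=1}^{3} \sum_{k=1}^{3} b_jc_k\, \mathbf{e}_j \times \mathbf{e}_k
\]

and 

\[
\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{3} a_ib_jc_k\, \mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_k).
\]

The multiple summation symbols indicate that we are to take the sums 
over all three indices, giving a total of 27 terms. But then 

\[
\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{3} a_ib_jc_k(\mathbf{e}_i \cdot \mathbf{e}_k)\mathbf{e}_j
 - \sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{3} a_ib_jc_k(\mathbf{e}_i \cdot \mathbf{e}_j)\mathbf{e}_k.
\]

In the first of these summations $\mathbf{e}_i \cdot \mathbf{e}_k = 0$ except when $i = k$; hence 
for any fixed j, the nine terms with that value of j reduce to three, and are 
in fact 

\[
\sum_{i=1}^{3} a_ic_i\, b_j\mathbf{e}_j = (\mathbf{A} \cdot \mathbf{C})\, b_j\mathbf{e}_j.
\]

Summing these over j gives $(\mathbf{A} \cdot \mathbf{C})\mathbf{B}$. Similarly, the second summation 
is identified as $-(\mathbf{A} \cdot \mathbf{B})\mathbf{C}$, and the proof of (5-1) has again been obtained. 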


From (5-1) and the anticommutative property of the cross product, we 
can obtain the similar identity: 


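Theorem 5-2. For any vectors A, B, and C, 

\[
(\mathbf{A} \times \mathbf{B}) \times \mathbf{C} = (\mathbf{A} \cdot \mathbf{C})\mathbf{B} - (\mathbf{B} \cdot \mathbf{C})\mathbf{A}. \tag{5-2}
\]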


Also, from (5-1) and (5-2) we can prove the following theorem. 
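Theorem 5-3. For any vectors A, B, C, and D, 

\[
(\mathbf{A} \times \mathbf{B}) \times (\mathbf{C} \times \mathbf{D}) = (\mathbf{A} \cdot \mathbf{B} \times \mathbf{D})\mathbf{C} - (\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})\mathbf{D}, \tag{5-3}
\]

and 

\[
(\mathbf{A} \times \mathbf{B}) \times (\mathbf{C} \times \mathbf{D}) = (\mathbf{A} \cdot \mathbf{C} \times \mathbf{D})\mathbf{B} - (\mathbf{B} \cdot \mathbf{C} \times \mathbf{D})\mathbf{A}. \tag{5-4}
\]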


Proof: We prove (5-3) by considering (A X B) on the left-hand side as 
a single vector and using (5-1) on this combination. The properties of the 
scalar triple product then suffice to complete the proof. 

The proof of (5-4) is found similarly by using (5-2), considering 
$(\mathbf{C} \times \mathbf{D})$ as a single vector. 

The right-hand sides of (5-3) and (5—4) represent the same vector. 
Setting these two expressions equal and doing some rearranging, we find 


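Theorem 5-4. For any vectors A, B, C, and D, 

\[
(\mathbf{B} \cdot \mathbf{C} \times \mathbf{D})\mathbf{A} - (\mathbf{A} \cdot \mathbf{C} \times \mathbf{D})\mathbf{B} + (\mathbf{A} \cdot \mathbf{B} \times \mathbf{D})\mathbf{C} - (\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})\mathbf{D} = \mathbf{0}. \tag{5-5}
\]

Note that the scalar triple products which appear in this result contain 
the three vectors not being multiplied and that the three vectors appear 
in their natural order in every case. 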
Next, we prove an extremely important pair of relations: 


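Theorem 5-5. For any vectors A, B, C, and D, 

\[
(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D}) = (\mathbf{A} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{D}) - (\mathbf{B} \cdot \mathbf{C})(\mathbf{A} \cdot \mathbf{D}), \tag{5-6}
\]

and 

\[
|\mathbf{A} \times \mathbf{B}|^2 = |\mathbf{A}|^2|\mathbf{B}|^2 - (\mathbf{A} \cdot \mathbf{B})^2. \tag{5-7}
\]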


Formula (5-7) is known as Lagrange’s Identity and (5-6) is usually called 
the Extended Lagrange Identity. 


Proof: We prove (5-6) by considering the left-hand member as a scalar 
triple product and using (5-1) in the following manner: 


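\[
\begin{aligned}
(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D}) &= \mathbf{A} \cdot [\mathbf{B} \times (\mathbf{C} \times \mathbf{D})] \\
&= \mathbf{A} \cdot [(\mathbf{B} \cdot \mathbf{D})\mathbf{C} - (\mathbf{B} \cdot \mathbf{C})\mathbf{D}] \\
&= (\mathbf{A} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{D}) - (\mathbf{B} \cdot \mathbf{C})(\mathbf{A} \cdot \mathbf{D}).
\end{aligned}
\]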


Formula (5-7) is a special case of (5-6), obtained by setting C = A 
and D = B. 
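
Identities (5-6) and (5-7) can also be spot-checked numerically. The following is a minimal sketch under our own naming (`dot` and `cross` are illustrative helpers, not notation from the text), using the vectors of Problem 4 of this section.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

A, B, C, D = [1, 2, 5], [-1, -5, 2], [1, 3, 2], [1, 1, -1]

# Extended Lagrange identity (5-6)
lhs_56 = dot(cross(A, B), cross(C, D))
rhs_56 = dot(A, C) * dot(B, D) - dot(B, C) * dot(A, D)

# Lagrange's identity (5-7), the special case C = A, D = B
lhs_57 = dot(cross(A, B), cross(A, B))
rhs_57 = dot(A, A) * dot(B, B) - dot(A, B) ** 2

print(lhs_56, rhs_56, lhs_57, rhs_57)
```

Because all components are integers, both comparisons are exact.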

Lagrange’s identity, Eq. (5-7), has a special geometric significance. If 
we let $\theta$ be the angle between the vectors A and B, we have 
$\mathbf{A} \cdot \mathbf{B} = |\mathbf{A}||\mathbf{B}| \cos\theta$. If we put this into (5-7), we have 

\[
\begin{aligned}
|\mathbf{A} \times \mathbf{B}|^2 &= |\mathbf{A}|^2|\mathbf{B}|^2 - |\mathbf{A}|^2|\mathbf{B}|^2 \cos^2\theta \\
&= |\mathbf{A}|^2|\mathbf{B}|^2[1 - \cos^2\theta] \\
&= |\mathbf{A}|^2|\mathbf{B}|^2 \sin^2\theta.
\end{aligned}
\]

Taking the square root of this expression shows that 

\[
|\mathbf{A} \times \mathbf{B}| = |\mathbf{A}||\mathbf{B}| \sin\theta, \tag{5-8}
\]

a formula which is similar to the relation found for the dot product. We 
of course take $\sin\theta$ as nonnegative in this formula. This is equivalent to 
using only angles between 0 and $\pi$. 

The importance of formula (5-8) is indicated by the fact that many 
texts on vector analysis have used this equation to define the cross product. 
That is, A X B would be defined as a vector orthogonal to both A and B, 
with a direction as given by the right-hand rule, and with a magnitude 
given by (5-8). Such a definition is, however, very difficult to work with. 
Proving such a fundamental fact as the linearity of the cross product would 
already cause trouble. 


FIGURE 5-1          FIGURE 5-2 


Consider now the parallelogram determined by the pair of vectors A and 
B (Fig. 5-1). If $\theta$ is the angle between the vectors A and B, then taking 
|B| as the length of the base of the parallelogram, we find that the length 
of the altitude is $|\mathbf{A}| \sin\theta$. Hence the area of this parallelogram is 
$|\mathbf{A}||\mathbf{B}| \sin\theta = |\mathbf{A} \times \mathbf{B}|$. This can also be interpreted as saying that 
the vector $\mathbf{A} \times \mathbf{B}$ has a direction orthogonal to the plane of A and B and 
a magnitude equal to the area of the plane parallelogram with sides A and B. 

If we now add a third vector C, we see that 


\[
(\mathbf{A} \times \mathbf{B}) \cdot \mathbf{C} = |\mathbf{A} \times \mathbf{B}||\mathbf{C}| \cos\phi,
\]

where $\phi$ is the angle between $\mathbf{A} \times \mathbf{B}$ and C. But $|\mathbf{C}| \cos\phi$ is exactly the 
length of the projection of C onto the line with direction $\mathbf{A} \times \mathbf{B}$, and 
hence is the altitude of the parallelepiped with sides A, B, and C (Fig. 5-2). 
That is, we see that the scalar triple product $\mathbf{A} \times \mathbf{B} \cdot \mathbf{C}$ has a magnitude 
equal to the volume of this parallelepiped. The sign of the scalar triple 
product is positive or negative as the triple of vectors forms a right-handed 
or left-handed set. That is, if the angle between $\mathbf{A} \times \mathbf{B}$ and C lies in 
the range 0 to $\pi/2$, then $\mathbf{A} \times \mathbf{B} \cdot \mathbf{C}$ is positive. 
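
The area and volume interpretations translate directly into a short computation. The sketch below is illustrative only: the function names `parallelogram_area` and `signed_volume` are our own, and the sample vectors are those of Problem 4 below.

```python
import math

def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def parallelogram_area(a, b):
    """Area of the parallelogram with sides a and b, i.e. |a x b|."""
    n = cross(a, b)
    return math.sqrt(dot(n, n))

def signed_volume(a, b, c):
    """Scalar triple product (a x b).c; its magnitude is the volume of the
    parallelepiped, its sign tells whether (a, b, c) is right-handed."""
    return dot(cross(a, b), c)

A, B, C = [1, 2, 5], [-1, -5, 2], [1, 3, 2]
area = parallelogram_area(A, B)
vol = signed_volume(A, B, C)
print(area, vol)
```

A positive value of `signed_volume` corresponds to a right-handed triple, a negative value to a left-handed one, and zero to a degenerate (flat) parallelepiped.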


PROBLEMS 


1. Prove (5-2) from (5-1). 
2. Prove (5-3) from (5-1). 
3. Prove (5-4) from (5-2). 
4. Let A = [1, 2, 5], B = [-1, -5, 2], C = [1, 3, 2], D = [1, 1, -1]. 
(a) Calculate directly the right-hand and left-hand members of (5-1). 
Which side is easier to calculate? 
(b) Do the same for (5-3). 
(c) Do the same for (5-6). 


5. Show that 

\[
|\mathbf{A} \times \mathbf{B}|^2 + (\mathbf{A} \cdot \mathbf{B})^2 = |\mathbf{A}|^2|\mathbf{B}|^2,
\]

and prove: 
(a) $\mathbf{A} \times \mathbf{B} = \mathbf{0}$ if and only if $|\mathbf{A} \cdot \mathbf{B}| = |\mathbf{A}||\mathbf{B}|$, 
(b) $\mathbf{A} \cdot \mathbf{B} = 0$ if and only if $|\mathbf{A} \times \mathbf{B}| = |\mathbf{A}||\mathbf{B}|$. 
6. Calculate the area of the parallelogram determined by: 


(a) A and B in Problem 4 
(b) A and C in Problem 4 
(c) B and C in Problem 4 


7. Calculate the volume of the parallelepiped determined by A, B, and C of 
Problem 4. 


8. Show that if A, B, and C are three points in space, then the area of the 
triangle with these three points as vertices is 

\[
\tfrac{1}{2}\,|\overrightarrow{AB} \times \overrightarrow{AC}|.
\]


9. Show that if A, B, C, and D are four points in space, then the volume of 
the tetrahedron with these four points as vertices is 

\[
\tfrac{1}{6}\,|\overrightarrow{AB} \cdot \overrightarrow{AC} \times \overrightarrow{AD}|.
\]


10. Let A = (1,3,7), B = (2,5,1), C = (1,1,5), and D = (—2, 3, 2). 
(a) Find the area of the triangle ABC. 


(b) Find the area of the triangle ABD. 
(c) Find the volume of the tetrahedron ABCD. 
11. Let $(a_1, a_2)$ and $(b_1, b_2)$ be two points of the plane. Show that the 
area of the parallelogram which has vertices at the origin and these two 
points is the absolute value of 

\[
\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}.
\]


12. Find a formula for the volume of the parallelepiped determined by the four 
vertices, O (the origin), A, B, and C, in terms of the coordinates of A, B, 
and C. (See formula 4-12.) 


13. From the results of Problems 11 and 12, can you give a condition on the 
coordinates of the points so that 
(a) two points in the plane are on the same line through the origin, 
or 
(b) three points in space are on the same plane through the origin? 


14. Let A and B be two nonzero vectors. Discuss the problem of finding a 
vector X such that 

\[
\mathbf{A} \times \mathbf{X} = \mathbf{B}.
\]

Find all solutions of this equation if any exist. [Hint: What is the direction 
of the cross product of two vectors? What if B is not orthogonal to A? If 
B is orthogonal to A, X must be orthogonal to B. Try writing 
$\mathbf{X} = \mathbf{U} \times \mathbf{B}$. What conditions must U satisfy?] 


15. Show that there are five different ways in which parentheses can be intro- 
duced into the product A X B X A X B to make it well defined. Using the 
formulas of this section, simplify each product and show that four of them 
are always the same. 


16. Using the formulas of this section, simplify each of the following expressions. 
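(a) $\mathbf{A} \times (\mathbf{A} \times \mathbf{B})$ 
(b) $\mathbf{A} \times (\mathbf{A} \times (\mathbf{A} \times \mathbf{B}))$ 
(c) $\mathbf{A} \times (\mathbf{A} \times (\mathbf{A} \times (\mathbf{A} \times \mathbf{B})))$ 
(d) $\mathbf{A} \times (\mathbf{A} \times (\mathbf{A} \times (\mathbf{A} \times (\mathbf{A} \times \mathbf{B}))))$ 
(e) $\mathbf{A} \times (\mathbf{A} \times (\mathbf{A} \times (\mathbf{A} \times (\mathbf{A} \times (\mathbf{A} \times \mathbf{B})))))$ 
(f) What would the general formula be? 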


17. Using (5-8), prove the law of sines for a triangle by vector methods. 


18. Find a condition on the vectors A, B, C, and D which will guarantee that the 
plane through the origin, A, and B will be orthogonal to the plane through 
the origin, C, and D. 


19. Prove each of the following: 
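(a) $(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{B} \times \mathbf{C}) \times (\mathbf{C} \times \mathbf{A}) = (\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})^2$ 
(b) $(\mathbf{A} \times \mathbf{B}) \times (\mathbf{A} \times \mathbf{C}) = (\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})\mathbf{A}$ 
(c) $(((\mathbf{A} \times \mathbf{B}) \times \mathbf{A}) \times (\mathbf{A} \times \mathbf{B})) \cdot ((\mathbf{A} \times \mathbf{B}) \times \mathbf{A}) = 0$ 
(d) $|\mathbf{A} \times (\mathbf{A} \times \mathbf{B})|^2 = |\mathbf{A}|^4|\mathbf{B}|^2 - |\mathbf{A}|^2(\mathbf{A} \cdot \mathbf{B})^2$ 
(e) $\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) + \mathbf{B} \times (\mathbf{C} \times \mathbf{A}) + \mathbf{C} \times (\mathbf{A} \times \mathbf{B}) = \mathbf{0}$ 
(f) $(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D}) + (\mathbf{A} \times \mathbf{D}) \cdot (\mathbf{B} \times \mathbf{C}) = (\mathbf{A} \times \mathbf{C}) \cdot (\mathbf{B} \times \mathbf{D})$ 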




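20. Let $\mathbf{A}_1$, $\mathbf{A}_2$, and $\mathbf{A}_3$ be three given vectors. Define 
$\mathbf{B}_1 = \mathbf{A}_2 \times \mathbf{A}_3$, $\mathbf{B}_2 = \mathbf{A}_3 \times \mathbf{A}_1$, $\mathbf{B}_3 = \mathbf{A}_1 \times \mathbf{A}_2$. 
Prove that $\mathbf{A}_i \cdot \mathbf{B}_j = 0$ for all $i \ne j$. Can you 
interpret this geometrically? 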
We said that two vectors are collinear if and only if one is a scalar 
multiple of the other. We have used several intuitive properties of collinear- 
ity in our discussions of previous sections. In this section we would like 
to organize and prove these properties more carefully. The first thing we 
are interested in is the connection between the cross and dot products and 
collinearity. This connection is given by the following theorem: 


Theorem 5-6. Two vectors A and B are collinear if and only if either 
$\mathbf{A} \times \mathbf{B} = \mathbf{0}$ or $|\mathbf{A} \cdot \mathbf{B}| = |\mathbf{A}||\mathbf{B}|$. 


Proof: Formula (5-7) of the last section shows that the two conditions 
of the theorem are equivalent (see also Problem 5 of that section). There- 
fore, we see that this result has already been proved in Theorem 4-7. 
We will, however, offer another proof here, which is somewhat simpler 
than the proof given in Theorem 4-7. 

Half of the theorem is immediately obvious, for if A and B are collinear, 
then there is some t such that A = tB, and 


A x B = tB x B = t0 = 0. 


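The other half of the proof is similar to the proof of the Cauchy-Schwarz 
inequality. Suppose that $|\mathbf{A} \cdot \mathbf{B}| = |\mathbf{A}||\mathbf{B}|$. Further, let us suppose that 
$\mathbf{A} \ne \mathbf{0}$ (if $\mathbf{A} = \mathbf{0}$, then A and B are trivially collinear). Set 
$t = \pm|\mathbf{B}|/|\mathbf{A}|$, choosing the same sign as $(\mathbf{A} \cdot \mathbf{B})$ so that 

\[
t(\mathbf{A} \cdot \mathbf{B}) = \frac{|\mathbf{B}|}{|\mathbf{A}|}\,|\mathbf{A} \cdot \mathbf{B}| = \frac{|\mathbf{B}|}{|\mathbf{A}|}\,|\mathbf{A}||\mathbf{B}| = |\mathbf{B}|^2.
\]

Then, just as in the proof of the Cauchy-Schwarz inequality, we find 

\[
\begin{aligned}
|t\mathbf{A} - \mathbf{B}|^2 &= t^2|\mathbf{A}|^2 - 2t(\mathbf{A} \cdot \mathbf{B}) + |\mathbf{B}|^2 \\
&= |\mathbf{B}|^2 - 2|\mathbf{B}|^2 + |\mathbf{B}|^2 = 0.
\end{aligned}
\]

Hence we can conclude that $\mathbf{B} = t\mathbf{A}$, thus proving the theorem. 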
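
The two equivalent criteria of Theorem 5-6 are easy to try out on concrete vectors. The sketch below assumes integer (or otherwise exact) components so that equality tests are meaningful; the function names are our own, and the second criterion is tested in squared form to avoid square roots.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def collinear_by_cross(a, b):
    """First criterion of Theorem 5-6: A x B = 0."""
    return cross(a, b) == [0, 0, 0]

def collinear_by_dot(a, b):
    """Second criterion, squared: (A.B)^2 = |A|^2 |B|^2."""
    return dot(a, b) ** 2 == dot(a, a) * dot(b, b)

pairs = [([1, 2, 3], [-2, -4, -6]),   # B = -2A, collinear
         ([1, 2, 3], [1, 2, 4]),      # not collinear
         ([0, 0, 0], [5, 1, 2])]      # zero vector, trivially collinear
results = [(collinear_by_cross(a, b), collinear_by_dot(a, b)) for a, b in pairs]
print(results)
```

With floating-point components the exact comparisons would have to be replaced by a tolerance test.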

As a consequence of this result, we can prove the following theorem, 
which states a fact that we have already been using in our informal dis- 
cussion. It is, in effect, the converse of the statement that A X B is orthog- 
onal to both A and B. 
Theorem 5-7. If C is orthogonal to both A and B, then C is collinear 
with $\mathbf{A} \times \mathbf{B}$. 


Proof: If C is orthogonal to both A and B, then $\mathbf{C} \cdot \mathbf{A} = 0$ and 
$\mathbf{C} \cdot \mathbf{B} = 0$. But then, using (5-1), we can compute 

\[
\begin{aligned}
\mathbf{C} \times (\mathbf{A} \times \mathbf{B}) &= (\mathbf{C} \cdot \mathbf{B})\mathbf{A} - (\mathbf{C} \cdot \mathbf{A})\mathbf{B} \\
&= \mathbf{0},
\end{aligned}
\]

and hence we can conclude from Theorem 5-6 that C is collinear with 
$\mathbf{A} \times \mathbf{B}$. This result should be compared with Theorems 4-2 and 4-3. 


We now turn to a consideration of coplanar vectors. We shall call a 
collection of vectors coplanar if they are all parallel to the same plane. 
If we think of a set of vectors as being line segments drawn from the 
origin, then they are coplanar only if there exists a plane through the 
origin containing all of them. From our definition of a plane, this will 
occur only if there is some vector (the orthogonal vector to the plane) 
orthogonal to all vectors in the plane. Therefore, we use this as our formal 
definition. 


Definition 5-1. A collection of vectors is called coplanar if and only if 
there exists a nonzero vector N orthogonal to all vectors in the collection. 


Directly from this definition we can prove 


Theorem 5-8. Two vectors are always coplanar. Three vectors, A, B, 
and C, are coplanar if and only if $\mathbf{A} \cdot \mathbf{B} \times \mathbf{C} = 0$. 


Proof: Let A and B be two given vectors. Suppose first that they are 
not collinear. Then $\mathbf{A} \times \mathbf{B} \ne \mathbf{0}$ (from Theorem 5-6), and the vector 
$\mathbf{N} = \mathbf{A} \times \mathbf{B}$ will be the required common orthogonal to A and B. 

On the other hand, if A and B are collinear and if either is nonzero, then 
any nonzero vector orthogonal to it will satisfy the requirements. The 
existence of such a vector is easy to show. If both A and B are zero, then 
any nonzero vector is orthogonal to both. 

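The remaining part of the theorem requires two proofs, since it is an 
“if and only if” statement. For the first proof, let us suppose that A, B, 
and C are coplanar, that is, that there exists a nonzero vector N orthogonal 
to all three. We must then show that $\mathbf{A} \cdot \mathbf{B} \times \mathbf{C} = 0$. However, since 
$\mathbf{A} \cdot \mathbf{N} = 0$ and $\mathbf{B} \cdot \mathbf{N} = 0$, we find from Theorem 5-7 that N is collinear 
with $\mathbf{A} \times \mathbf{B}$. Since $\mathbf{N} \ne \mathbf{0}$, this means that there must exist a scalar t 
such that $\mathbf{A} \times \mathbf{B} = t\mathbf{N}$. But then we calculate 

\[
\begin{aligned}
\mathbf{A} \cdot \mathbf{B} \times \mathbf{C} &= (\mathbf{A} \times \mathbf{B}) \cdot \mathbf{C} \\
&= t\mathbf{N} \cdot \mathbf{C} \\
&= 0,
\end{aligned}
\]

because N was also orthogonal to C. 

To prove the last part of the theorem, let us suppose that 
$\mathbf{A} \cdot \mathbf{B} \times \mathbf{C} = 0$. If $\mathbf{A} \times \mathbf{B} \ne \mathbf{0}$, we can set $\mathbf{N} = \mathbf{A} \times \mathbf{B}$. Then 
$\mathbf{N} \cdot \mathbf{A} = \mathbf{N} \cdot \mathbf{B} = 0$, and $\mathbf{N} \cdot \mathbf{C} = \mathbf{A} \times \mathbf{B} \cdot \mathbf{C} = 0$, hence A, B, and C 
are coplanar. On the other hand, if $\mathbf{A} \times \mathbf{B} = \mathbf{0}$, then A and B are 
collinear. If $\mathbf{A} = \mathbf{B} = \mathbf{0}$, then using the first part of the theorem, we 
see that A, B, and C are coplanar. If one of these, say A, is not zero, then 
$\mathbf{B} = t\mathbf{A}$, and again from the first part of the theorem, A and C are 
coplanar. That is, there exists a nonzero N such that 
$\mathbf{N} \cdot \mathbf{A} = \mathbf{N} \cdot \mathbf{C} = 0$. This also implies $\mathbf{N} \cdot \mathbf{B} = t(\mathbf{N} \cdot \mathbf{A}) = 0$, 
and hence we have proved the theorem. 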

The last part of this proof is somewhat involved, because there are 
many separate cases which have to be considered. The reader may find it 
useful to diagram the proof, seeing how the various cases arise and how 
they are disposed of. The basic ideas used in the “main” cases are really 
the important ones. 
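
In computational terms, Theorem 5-8 reduces the coplanarity of three vectors to a single scalar test. A minimal sketch, assuming exact (integer) components and using our own helper names and illustrative sample vectors:

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def is_coplanar(a, b, c):
    """Theorem 5-8: A, B, C are coplanar iff A . (B x C) = 0."""
    return dot(a, cross(b, c)) == 0

flat = is_coplanar([1, -3, 2], [1, -1, -1], [-3, -7, 18])   # C = 5A - 8B
solid = is_coplanar([1, 0, 0], [1, 1, 0], [1, 1, 1])        # triple product is 1
print(flat, solid)
```

Note that the test covers all the degenerate cases of the proof (collinear or zero vectors) without treating them separately, since the triple product vanishes in each of them.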


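Theorem 5-9. Let A and B be two noncollinear vectors. Then a vector C 
is coplanar with A and B if and only if there exist scalars s and t such 
that 

\[
\mathbf{C} = s\mathbf{A} + t\mathbf{B}. \tag{5-9}
\]


Proof: Suppose that $\mathbf{C} = s\mathbf{A} + t\mathbf{B}$. Then 

\[
\begin{aligned}
\mathbf{A} \cdot \mathbf{B} \times \mathbf{C} &= (\mathbf{A} \times \mathbf{B}) \cdot \mathbf{C} \\
&= (\mathbf{A} \times \mathbf{B}) \cdot [s\mathbf{A} + t\mathbf{B}] \\
&= s(\mathbf{A} \times \mathbf{B}) \cdot \mathbf{A} + t(\mathbf{A} \times \mathbf{B}) \cdot \mathbf{B}.
\end{aligned}
\]

But $\mathbf{A} \times \mathbf{B} \cdot \mathbf{A} = \mathbf{A} \times \mathbf{B} \cdot \mathbf{B} = 0$, and hence we conclude from Theorem 
5-8 that A, B, and C are coplanar. 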

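The other half of the theorem is more difficult. Suppose that A, B, and 
C are coplanar. Then $\mathbf{A} \cdot \mathbf{B} \times \mathbf{C} = 0$ (from Theorem 5-8). From the 
hypothesis that A and B are not collinear we have $\mathbf{A} \times \mathbf{B} \ne \mathbf{0}$. Let 
$\mathbf{D} = \mathbf{A} \times \mathbf{B}$. We now make use of formula (5-5) of the last section. This 
gives us 

\[
(\mathbf{B} \cdot \mathbf{C} \times \mathbf{D})\mathbf{A} - (\mathbf{A} \cdot \mathbf{C} \times \mathbf{D})\mathbf{B} + (\mathbf{A} \cdot \mathbf{B} \times \mathbf{D})\mathbf{C} - (\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})\mathbf{D} = \mathbf{0}. \tag{5-10}
\]

As we have seen, the coefficient of D in this expression is 0. The coefficient 
of C is 

\[
\begin{aligned}
(\mathbf{A} \cdot \mathbf{B} \times \mathbf{D}) &= (\mathbf{A} \times \mathbf{B} \cdot \mathbf{D}) \\
&= (\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{A} \times \mathbf{B}) \\
&= |\mathbf{A} \times \mathbf{B}|^2,
\end{aligned}
\]

which is nonzero by hypothesis. We can therefore solve (5-10) for C. 
Doing so and replacing D by $\mathbf{A} \times \mathbf{B}$ gives 

\[
\mathbf{C} = -\frac{(\mathbf{B} \times \mathbf{C}) \cdot (\mathbf{A} \times \mathbf{B})}{|\mathbf{A} \times \mathbf{B}|^2}\,\mathbf{A}
 + \frac{(\mathbf{A} \times \mathbf{C}) \cdot (\mathbf{A} \times \mathbf{B})}{|\mathbf{A} \times \mathbf{B}|^2}\,\mathbf{B}. \tag{5-11}
\]

This is the expression of the form (5-9) needed to prove the theorem. 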


The last result can be extended to give us the following important 
theorem: 


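Theorem 5-10. If the vectors A, B, and C are not coplanar, then 
every vector D can be written as a linear combination of A, B, and 
C. That is, for any D there exist scalars s, t, and u such that 

\[
\mathbf{D} = s\mathbf{A} + t\mathbf{B} + u\mathbf{C}. \tag{5-12}
\]

FIGURE 5-3 


Proof: For any four vectors, formula (5-10) holds. By the hypothesis 
that A, B, and C are not coplanar and by Theorem 5-8, 
$\mathbf{A} \cdot \mathbf{B} \times \mathbf{C} \ne 0$. Hence we can solve (5-10) for D, giving 

\[
\mathbf{D} = \frac{(\mathbf{B} \cdot \mathbf{C} \times \mathbf{D})}{(\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})}\,\mathbf{A}
 - \frac{(\mathbf{A} \cdot \mathbf{C} \times \mathbf{D})}{(\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})}\,\mathbf{B}
 + \frac{(\mathbf{A} \cdot \mathbf{B} \times \mathbf{D})}{(\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})}\,\mathbf{C}. \tag{5-13}
\]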

The last theorem tells us that if we are given any three noncoplanar 
vectors, then every vector can be expressed as a linear combination of 
these. The geometric meaning of this statement is illustrated in Fig. 5-3. 
Here, all vectors are represented as directed line segments from the 
origin. Comparing this sketch with Fig. 3-1, we see that the three vectors 
A, B, and C can be thought of as determining an oblique system of co- 
ordinates. A point D in space can be determined by its A, B, C coordinates, 
which are (s, t, u) as given by (5-12). We will exploit this point of view 
further in the next section. 

Let us emphasize that the formulas and conclusions of this section have 
been obtained with the help of the cross product. In a later section we will 
see what can be done without having to use the cross product. 
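
Formula (5-13) gives the oblique coordinates (s, t, u) of D directly as ratios of scalar triple products. The following is a minimal sketch of that computation, assuming integer components so that exact fractions can be used; the function names and the sample vectors are our own illustrations.

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def triple(a, b, c):
    """Scalar triple product A . (B x C)."""
    return dot(a, cross(b, c))

def oblique_coordinates(a, b, c, d):
    """Coefficients (s, t, u) of D = sA + tB + uC as given by (5-13).
    Integer components are assumed; A, B, C must not be coplanar."""
    denom = triple(a, b, c)
    if denom == 0:
        raise ValueError("A, B, and C are coplanar")
    s = Fraction(triple(b, c, d), denom)
    t = Fraction(-triple(a, c, d), denom)   # note the minus sign in (5-13)
    u = Fraction(triple(a, b, d), denom)
    return s, t, u

A, B, C, D = [1, 0, 0], [1, 1, 0], [1, 1, 1], [5, 3, -1]
s, t, u = oblique_coordinates(A, B, C, D)
rebuilt = [s * x + t * y + u * z for x, y, z in zip(A, B, C)]
print(s, t, u, rebuilt)
```

Recombining the coefficients with A, B, and C reproduces D, which is the content of (5-12).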
Formulas (5-11) and (5-13) are easy to use in practice. For example, 
suppose that 

\[
\mathbf{A} = [1, -3, 2], \qquad \mathbf{B} = [1, -1, -1], \qquad \text{and} \qquad \mathbf{C} = [-3, -7, 18].
\]


We find A x B = [5, 3, 2], and so A and B are not collinear. But A -B x 
C = 0; so the three vectors are coplanar. Computing the coefficients in 
(5-11) gives us 

C = 5A — 8B, 


as the reader may easily verify. 
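
The verification can be carried out mechanically. The sketch below evaluates the two coefficients of (5-11) for the vectors of this example; the helper names are our own, and exact fractions are used on the assumption that the components are integers.

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def coplanar_coefficients(a, b, c):
    """Coefficients (s, t) of C = sA + tB from formula (5-11).
    Assumes A and B are noncollinear and C is coplanar with them."""
    n = cross(a, b)
    n2 = dot(n, n)                          # |A x B|^2
    if n2 == 0:
        raise ValueError("A and B are collinear")
    s = Fraction(-dot(cross(b, c), n), n2)  # coefficient of A
    t = Fraction(dot(cross(a, c), n), n2)   # coefficient of B
    return s, t

A, B, C = [1, -3, 2], [1, -1, -1], [-3, -7, 18]
n = cross(A, B)
s, t = coplanar_coefficients(A, B, C)
rebuilt = [s * x + t * y for x, y in zip(A, B)]
print(n, dot(n, C), s, t, rebuilt)
```

If C were not coplanar with A and B, the formula would still return two numbers, but `rebuilt` would then differ from C; the coplanarity test `dot(n, C) == 0` should be made first.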

Theorems 5-9 and 5-10 seem to have a great deal in common. Adding 
the remark that if $\mathbf{A} \ne \mathbf{0}$ and A and B are collinear, then there is a scalar 
s such that $\mathbf{B} = s\mathbf{A}$, we see that the following three statements are related: 


(i) A single vector A is nonzero; 
(ii) Two vectors A and B are noncollinear; 
(iii) Three vectors A, B, and C are noncoplanar. 


There is a fundamental property which underlies these three statements. 
This property has been found to be most important in any attempt to 
extend the concept of vectors. It is called linear independence. 


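Definition 5-2. A finite collection of vectors $\mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_n$ is called 
linearly dependent if and only if there exist scalars $\lambda_i$ $(i = 1, \ldots, n)$, 
not all zero, such that 

\[
\lambda_1\mathbf{A}_1 + \lambda_2\mathbf{A}_2 + \cdots + \lambda_n\mathbf{A}_n = \mathbf{0}. \tag{5-14}
\]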


A collection of vectors which is not linearly dependent is called linearly 
independent. 


Note that in order to prove that a set of vectors is linearly independent, 
one must show that if (5-14) holds, then $\lambda_1 = \lambda_2 = \cdots = \lambda_n = 0$. 
The connection of linear independence with the concepts already discussed 
in this section is contained in the following theorem: 


Theorem 5-11. 


(1) A single vector A is linearly dependent if and only if A = 0. 

(2) Two vectors, A and B, are linearly dependent if and only if they 
are collinear. 

(3) Three vectors are linearly dependent if and only if they are 
coplanar. 

(4) Four vectors are always linearly dependent. 
Proof: Statement (1) follows obviously from Definition 5-2. 
Statement (2) follows easily from the definition of collinearity. For if 
there exist $\lambda_1$ and $\lambda_2$ not both zero such that 

\[
\lambda_1\mathbf{A} + \lambda_2\mathbf{B} = \mathbf{0}, \tag{5-15}
\]

then we can solve for one of the two vectors as a scalar multiple of the 
other, and hence vectors A and B are collinear. On the other hand, if 
A and B are collinear, then one is a scalar multiple of the other. Suppose, 
for example, that $\mathbf{A} = t\mathbf{B}$. This statement can then be rewritten in the 
form (5-15), 

\[
\mathbf{A} + (-t)\mathbf{B} = \mathbf{0}.
\]


One of the coefficients is 1, and hence is nonzero. 

Statement (3) follows in the same manner with the help of Theorem 5-9. 

Similarly, statement (4) follows from Theorem 5—10, provided three of 
the four vectors are noncoplanar. If, however, some three of the vectors 
are coplanar, then these three are already linearly dependent. We can, 
therefore, write (5-14) with not all coefficients zero, using just these three 
vectors. To write such an expression involving all four vectors, just add 
the fourth vector with a coefficient of zero. This shows that the four 
vectors are linearly dependent in any case. 
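
For four concrete vectors, identity (5-5) of the previous section actually exhibits a dependence relation of the form (5-14): its four scalar triple products serve as the coefficients. The sketch below illustrates this with the vectors of Problem 4 of the previous section; the helper names are our own. (If every triple of the four vectors is coplanar, all four coefficients vanish and the relation must be found as in the proof above instead.)

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def triple(a, b, c):
    """Scalar triple product A . (B x C)."""
    return dot(a, cross(b, c))

def dependency_coefficients(a, b, c, d):
    """Coefficients of A, B, C, D in identity (5-5)."""
    return [triple(b, c, d), -triple(a, c, d), triple(a, b, d), -triple(a, b, c)]

A, B, C, D = [1, 2, 5], [-1, -5, 2], [1, 3, 2], [1, 1, -1]
coeffs = dependency_coefficients(A, B, C, D)
# Form the linear combination component by component.
combo = [sum(k * v[i] for k, v in zip(coeffs, (A, B, C, D))) for i in range(3)]
print(coeffs, combo)
```

The combination comes out as the zero vector with coefficients that are not all zero, so the four vectors are linearly dependent, as statement (4) asserts.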

Note that Theorem 5-10 can now be rewritten in the following form: 


Theorem 5-12. If A, B, and C are linearly independent, then any vector 
D can be written as a linear combination of A, B, and C. 


PROBLEMS 
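1. Let n vectors, $\mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_n$, be given. Let R be some nonzero vector. 
Prove that the vectors $\mathbf{B}_1 = \mathbf{R} \times \mathbf{A}_1$, $\mathbf{B}_2 = \mathbf{R} \times \mathbf{A}_2, \ldots,$ 
$\mathbf{B}_n = \mathbf{R} \times \mathbf{A}_n$ are coplanar. 


2. Prove that if A is collinear with $\mathbf{B} \ne \mathbf{0}$ and B is collinear with C, then A 
is collinear with C. 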


3. Prove that there always exists a nonzero vector orthogonal to a given 
vector. 


4. Use identity (5-5) on the final expression obtained for C in formula (5-11). 
How is this expression simplified if $|\mathbf{A}| = 1$, $|\mathbf{B}| = 1$, and A and B are 
orthogonal? 


5. Let A, B, and C be three noncoplanar vectors. Show that the representation 
of a vector D as given in (5-12) is unique. 
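6. Show that a plane is defined by 

\[
\{\mathbf{X} \mid \mathbf{X} = \mathbf{P}_0 + s\mathbf{A} + t\mathbf{B}, \text{ for all real } s \text{ and } t\},
\]

where A and B are a pair of noncollinear vectors. What is the equation of 
this plane? (This is called the parametric representation of a plane.) 


7. Let $\mathbf{A} = [a_1, a_2, a_3]$, $\mathbf{B} = [b_1, b_2, b_3]$, $\mathbf{C} = [c_1, c_2, c_3]$ and 
$\mathbf{D} = [d_1, d_2, d_3]$. Show that the system of equations 

\[
\begin{aligned}
a_1x + b_1y + c_1z &= d_1, \\
a_2x + b_2y + c_2z &= d_2, \\
a_3x + b_3y + c_3z &= d_3
\end{aligned}
\]

is equivalent to the single vector equation 

\[
x\mathbf{A} + y\mathbf{B} + z\mathbf{C} = \mathbf{D}.
\]

What happens to this equation if we take the dot product of each side with 
$\mathbf{B} \times \mathbf{C}$? Use this to solve for x. Do the same with $\mathbf{A} \times \mathbf{C}$ and $\mathbf{A} \times \mathbf{B}$. 
Under what circumstances will a solution exist? 


8. Prove that for any vectors A, B, C, and D, with $\mathbf{A} \cdot \mathbf{B} \times \mathbf{C} \ne 0$, 

\[
\mathbf{D} = \frac{(\mathbf{D} \cdot \mathbf{B} \times \mathbf{C})}{(\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})}\,\mathbf{A}
 + \frac{(\mathbf{A} \cdot \mathbf{D} \times \mathbf{C})}{(\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})}\,\mathbf{B}
 + \frac{(\mathbf{A} \cdot \mathbf{B} \times \mathbf{D})}{(\mathbf{A} \cdot \mathbf{B} \times \mathbf{C})}\,\mathbf{C}.
\]


9. Suppose that A and B are two noncollinear vectors. Show that if 
$\mathbf{C} = \mathbf{A} \times \mathbf{B}$, then A, B, and C are noncoplanar. 


10. Let A and B be noncollinear vectors. Rewrite formula (5-13) in terms of 
A, B, and D alone when $\mathbf{C} = \mathbf{A} \times \mathbf{B}$. Compare with (5-11). 


11. For each of the following, prove that the three vectors are coplanar and 
obtain C as a linear combination of A and B. 
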

(a) A = [2, 0, 1], B = [0, 3, 4], C = [8, -3, 0] 
(b) A = (1, 7, 2), B = [=1, ð, 1), C= (7, l; —10] 
(c) A = 1, 2, 0}, B = [1, l, 0), C = (1, =A, 0] 
(d) A = [1, 1, 1], B = [1, 2, 3], C = [5, 0, -5] 


12. For each of the following, prove that A, B, and C are not coplanar and 
obtain D as a linear combination of A, B, and C. 


(a) A = (1, 0, 0], B = [1, 1, 0], 
C = (1, 1, 1], D= [5, 3, =I] 
(b) A = (1, 2, =]; B = [2, 0, 3], 
C = [6, —5, —4], D = [3, —12, —24] 
(c) A = [=1, 3, —1], B = [4, 2, 1], 
c = [3,5, 1], D = [—8, 24, —10] 
(d) A = [2, 7, 5], B = [—1, —8, 3], 
C = [1, 1, —5], D = [6, 3, 37] 
We have defined a vector as a triple of numbers. This gives a natural 
representation of an arbitrary vector as a linear combination of the three 
special unit vectors $\mathbf{e}_1 = [1, 0, 0]$, $\mathbf{e}_2 = [0, 1, 0]$, and $\mathbf{e}_3 = [0, 0, 1]$. 
That is, 

\[
[x, y, z] = x\mathbf{e}_1 + y\mathbf{e}_2 + z\mathbf{e}_3.
\]

We remark that these are not the symbols most often used in vector 
analysis. The more common symbols are i, j, and k. We have avoided 
using these for two reasons. First, the letters i and j already cause enough 
confusion, since they are used as symbols with several different meanings 
in mathematics, physics, and electrical engineering. But a much more 
important reason is that we are looking forward to a generalization of 
three-dimensional vectors to a higher number of dimensions. Most of the 
formulas we have derived will have natural generalizations, but the symbols 
i, j, and k would have to be replaced by others (normally 
$\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$). 

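Theorem 5-10 of the previous section shows that any three noncoplanar 
vectors can be used to express arbitrary vectors in space. Suppose we had 
three noncoplanar vectors $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$, and 

\[
\begin{aligned}
\mathbf{A} &= a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + a_3\mathbf{u}_3, \\
\mathbf{B} &= b_1\mathbf{u}_1 + b_2\mathbf{u}_2 + b_3\mathbf{u}_3.
\end{aligned}
\]

Then using the linearity of the dot product, we obtain 

\[
\mathbf{A} \cdot \mathbf{B} = \sum_{i=1}^{3} \sum_{j=1}^{3} a_ib_j(\mathbf{u}_i \cdot \mathbf{u}_j).
\]

This summation contains nine terms in all. It would be much simpler if 
the vectors $\mathbf{u}_i$ were mutually orthogonal. Then $\mathbf{u}_i \cdot \mathbf{u}_j = 0$ if $i \ne j$, and 
we would have 

\[
\mathbf{A} \cdot \mathbf{B} = \sum_{i=1}^{3} a_ib_i|\mathbf{u}_i|^2.
\]

Again, this formula would be simplified if each $\mathbf{u}_i$ had magnitude 1. In 
this case, 

\[
\mathbf{A} \cdot \mathbf{B} = \sum_{i=1}^{3} a_ib_i,
\]

the same formula we had for the dot product in terms of the expansion by 
the natural unit vectors, $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$. 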
Definition 5-3. A set of three nonzero vectors, $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$, is called an 
orthogonal set if and only if $\mathbf{u}_i \cdot \mathbf{u}_j = 0$ for all $i \ne j$. It is called an 
orthonormal set if and only if it is an orthogonal set and in addition 
$|\mathbf{u}_i| = 1$ for all i. 


Note that according to this definition, the unit coordinate vectors $\mathbf{e}_1$, 
$\mathbf{e}_2$, and $\mathbf{e}_3$ are an orthonormal set. The following theorem is a direct 
consequence of this definition and shows how similar an orthonormal set 
is in its behavior to the unit coordinate vectors. 


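Theorem 5-13. Let $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ be an orthonormal set of vectors. Then this 
set of vectors is linearly independent. Given any vector A, 

\[
\mathbf{A} = (\mathbf{A} \cdot \mathbf{u}_1)\mathbf{u}_1 + (\mathbf{A} \cdot \mathbf{u}_2)\mathbf{u}_2 + (\mathbf{A} \cdot \mathbf{u}_3)\mathbf{u}_3. \tag{5-16}
\]

If $\mathbf{A} = a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + a_3\mathbf{u}_3$ and $\mathbf{B} = b_1\mathbf{u}_1 + b_2\mathbf{u}_2 + b_3\mathbf{u}_3$, then 

\[
\mathbf{A} \cdot \mathbf{B} = a_1b_1 + a_2b_2 + a_3b_3. \tag{5-17}
\]


Proof: To see that an orthonormal set is linearly independent we merely 
need observe that if 

\[
\lambda_1\mathbf{u}_1 + \lambda_2\mathbf{u}_2 + \lambda_3\mathbf{u}_3 = \mathbf{0},
\]

then 

\[
\begin{aligned}
0 &= \mathbf{0} \cdot \mathbf{u}_1 \\
&= (\lambda_1\mathbf{u}_1 + \lambda_2\mathbf{u}_2 + \lambda_3\mathbf{u}_3) \cdot \mathbf{u}_1 \\
&= \lambda_1(\mathbf{u}_1 \cdot \mathbf{u}_1) + \lambda_2(\mathbf{u}_2 \cdot \mathbf{u}_1) + \lambda_3(\mathbf{u}_3 \cdot \mathbf{u}_1) \\
&= \lambda_1.
\end{aligned}
\]

In exactly the same way we can compute $\lambda_2$ and $\lambda_3$ to be equal to zero. 
Hence the three vectors are linearly independent. But then, as we saw in 
Theorem 5-10, any vector A can be expressed as a linear combination of 
the three orthonormal vectors, 

\[
\mathbf{A} = a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + a_3\mathbf{u}_3.
\]

However, we may then compute 

\[
\mathbf{A} \cdot \mathbf{u}_1 = (a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + a_3\mathbf{u}_3) \cdot \mathbf{u}_1 = a_1
\]

just as above. Similarly we find $\mathbf{A} \cdot \mathbf{u}_2 = a_2$ and $\mathbf{A} \cdot \mathbf{u}_3 = a_3$, which 
proves the second assertion of the theorem. The final part was done above. 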


The question which this theorem immediately raises is: can we produce 
an orthonormal set of vectors from any three linearly independent vectors? 
Thinking geometrically it is quite obvious how this must be done. Suppose 
we have three vectors $\mathbf{A}_1$, $\mathbf{A}_2$, and $\mathbf{A}_3$ which are linearly independent. It 
is easy to produce a vector with magnitude 1 in the same direction as 
$\mathbf{A}_1$, namely $\mathbf{u}_1 = \mathbf{A}_1/|\mathbf{A}_1|$. Clearly, it is possible to find a vector orthogonal 
to $\mathbf{u}_1$ in the plane of $\mathbf{A}_1$ and $\mathbf{A}_2$. Such a vector is 
$\mathbf{B}_2 = (\mathbf{A}_1 \times \mathbf{A}_2) \times \mathbf{A}_1$. We then can set $\mathbf{u}_2 = \mathbf{B}_2/|\mathbf{B}_2|$. To get a third 
vector orthogonal to both $\mathbf{u}_1$ and $\mathbf{u}_2$, we only need to take $\mathbf{u}_1 \times \mathbf{u}_2$. This 
process can be simplified somewhat. Since the vector $\mathbf{u}_3$ is merely required 
to be orthogonal to the plane of $\mathbf{A}_1$ and $\mathbf{A}_2$, we could just as easily set 

\[
\mathbf{u}_3 = \frac{\mathbf{A}_1 \times \mathbf{A}_2}{|\mathbf{A}_1 \times \mathbf{A}_2|},
\]

and then $\mathbf{u}_2 = \mathbf{u}_3 \times \mathbf{u}_1$. The reader should note how the requirement 
that $\mathbf{A}_1$, $\mathbf{A}_2$, and $\mathbf{A}_3$ be linearly independent implies that the quantities 
$|\mathbf{A}_1|$, $|\mathbf{B}_2|$, and $|\mathbf{A}_1 \times \mathbf{A}_2|$ be all nonzero. 

Therefore, we see that we have actually proved: 


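Theorem 5-14. Let $\mathbf{A}_1$ and $\mathbf{A}_2$ be a given pair of noncollinear vectors. 
Let 

\[
\mathbf{u}_1 = \frac{\mathbf{A}_1}{|\mathbf{A}_1|}, \qquad
\mathbf{u}_3 = \frac{\mathbf{A}_1 \times \mathbf{A}_2}{|\mathbf{A}_1 \times \mathbf{A}_2|}, \qquad
\mathbf{u}_2 = \mathbf{u}_3 \times \mathbf{u}_1. \tag{5-18}
\]

Then $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ form an orthonormal set with $\mathbf{u}_1$ parallel to $\mathbf{A}_1$ and 
$\mathbf{u}_2$ coplanar with $\mathbf{A}_1$ and $\mathbf{A}_2$. 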


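For example, suppose we have the vectors $\mathbf{A}_1 = [6, -3, 2]$ and 
$\mathbf{A}_2 = [8, 3, 5]$. Then $|\mathbf{A}_1| = [36 + 9 + 4]^{1/2} = 7$, and so 

\[
\mathbf{u}_1 = \left[\tfrac{6}{7}, -\tfrac{3}{7}, \tfrac{2}{7}\right].
\]

Also 

\[
\mathbf{A}_1 \times \mathbf{A}_2 = [-21, -14, 42] = 7[-3, -2, 6],
\]

and hence $|\mathbf{A}_1 \times \mathbf{A}_2| = 7 \cdot 7 = 49$. Thus 

\[
\mathbf{u}_3 = \left[-\tfrac{3}{7}, -\tfrac{2}{7}, \tfrac{6}{7}\right],
\]

and taking the cross product, we have 

\[
\mathbf{u}_2 = \mathbf{u}_3 \times \mathbf{u}_1 = \left[\tfrac{2}{7}, \tfrac{6}{7}, \tfrac{3}{7}\right].
\]

The reader should verify for himself that these three vectors do indeed 
form an orthonormal set. 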
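
The construction of Theorem 5-14 is short enough to write out as a procedure. The following sketch uses floating-point arithmetic and our own function names; it is one possible rendering of formulas (5-18), applied to the vectors of the example above.

```python
import math

def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def orthonormal_from_pair(a1, a2):
    """Return (u1, u2, u3) as in (5-18); a1 and a2 must be noncollinear."""
    n = cross(a1, a2)
    len_a1 = math.sqrt(dot(a1, a1))
    len_n = math.sqrt(dot(n, n))
    if len_n == 0:
        raise ValueError("A1 and A2 are collinear")
    u1 = [x / len_a1 for x in a1]
    u3 = [x / len_n for x in n]
    u2 = cross(u3, u1)
    return u1, u2, u3

u1, u2, u3 = orthonormal_from_pair([6, -3, 2], [8, 3, 5])
# Table of all dot products u_i . u_j; it should be close to the identity.
gram = [[dot(p, q) for q in (u1, u2, u3)] for p in (u1, u2, u3)]
print(u1, u2, u3)
```

Because of rounding, the entries of `gram` agree with the identity pattern (1 on the diagonal, 0 elsewhere) only to within floating-point tolerance.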


We can now discuss coordinate systems. In our definition of points in 
space as triples of numbers, we assumed an intrinsic coordinate system. 
However, we feel that the points of our space have an independent ex- 
istence, without regard to the particular coordinate system that we happen 
to use. Similarly vectors, which are thought of as a property of directed 
line segments, independent of translation, cannot really depend on the co- 
ordinate system either, despite the way we defined them. 

We discussed the problem of translation earlier and saw that no real 
change in the properties of space resulted from translation of the co- 
ordinate system (which is algebraically equivalent to translation of the 
entire space). Translation of the coordinate system in space leaves the 
representation of vectors as triples of numbers unchanged, of course. 

We still have the problem of rotation of coordinate systems to consider. 
In this regard, it is best to make the identification between vectors and 
points of space and consider both simultaneously. That is, we consider 
all vectors to be directed line segments with initial points at the origin, 
and identify the terminal point of the segment with the vector. 

In this context, our initial choice of a set of mutually orthogonal co- 
ordinate axes corresponds to a choice of an orthonormal set of vectors: the 
unit vectors in the positive directions along the coordinate axes. The co- 
ordinates of a point (or the components of a vector) in terms of this 
coordinate system are then given by the three coefficients of these unit 
vectors in the expansion of the vector, for example, 


\[
\mathbf{A} = a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + a_3\mathbf{u}_3.
\]


Formula (5-16) of Theorem 5-13 shows how these coefficients can 
actually be obtained for a given new coordinate system. For example, the 
vectors 


“faa 





form an orthonormal set. If these are considered to define a coordinate 
system, say the $x'y'z'$-coordinate system (in that order), then any point X 
which initially has coordinates $(x, y, z)$ will have coordinates $(x', y', z')$ 
in this new coordinate system. To find the relationship between these 
coordinates, we observe that 

\[
\mathbf{X} = [x, y, z] = x\mathbf{e}_1 + y\mathbf{e}_2 + z\mathbf{e}_3 = x'\mathbf{u}_1 + y'\mathbf{u}_2 + z'\mathbf{u}_3,
\]

and from (5-16) we have 


1 
‘= (x. ay, 
Xx (X-u) = TE 
1 1 1 
'= (X; —- y + —z, 
f ea B “AB B’ 
= (X. U3) = t g — L — ES z. 


/6 /6 /6 


Formula (5-17) is of special interest in connection with the point of view 
of considering $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ as defining a new coordinate system. This 
formula says that the representation of the dot product of two vectors in 
terms of their components is the same no matter what orthonormal set 
of vectors is used to define the coordinate system. Despite the fact that 
our initial definition of the dot product was in terms of a given coordinate 
system, it is really independent of this choice. 

What about the cross product? It too was defined in terms of the 
intrinsic coordinates. Let us see what happens if we express vectors in 
terms of another orthonormal set. Suppose that 

\[
\begin{aligned}
\mathbf{A} &= a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + a_3\mathbf{u}_3, \\
\mathbf{B} &= b_1\mathbf{u}_1 + b_2\mathbf{u}_2 + b_3\mathbf{u}_3.
\end{aligned}
\]


Then if we take the cross product A X B, using the linearity and the 
anticommutative property, we find 


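\[
\mathbf{A} \times \mathbf{B} = (a_2b_3 - a_3b_2)\,\mathbf{u}_2 \times \mathbf{u}_3 + (a_1b_3 - a_3b_1)\,\mathbf{u}_1 \times \mathbf{u}_3
 + (a_1b_2 - a_2b_1)\,\mathbf{u}_1 \times \mathbf{u}_2. \tag{5-20}
\]

Here, we have discarded the terms involving $\mathbf{u}_1 \times \mathbf{u}_1$, $\mathbf{u}_2 \times \mathbf{u}_2$, and 
$\mathbf{u}_3 \times \mathbf{u}_3$ since they are zero. 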

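Now, we know that $\mathbf{u}_1 \times \mathbf{u}_2$ is orthogonal to both $\mathbf{u}_1$ and $\mathbf{u}_2$, hence 
$\mathbf{u}_1 \times \mathbf{u}_2 = c\mathbf{u}_3$; but from (5-16) we see that $c = \mathbf{u}_1 \times \mathbf{u}_2 \cdot \mathbf{u}_3$, and so 

\[
\mathbf{u}_1 \times \mathbf{u}_2 = (\mathbf{u}_1 \times \mathbf{u}_2 \cdot \mathbf{u}_3)\,\mathbf{u}_3.
\]

On the other hand, from identity (5-7) 

\[
|\mathbf{u}_1 \times \mathbf{u}_2|^2 = |\mathbf{u}_1|^2|\mathbf{u}_2|^2 - (\mathbf{u}_1 \cdot \mathbf{u}_2)^2 = 1,
\]

since $\mathbf{u}_1 \cdot \mathbf{u}_2 = 0$, so that $\mathbf{u}_1 \times \mathbf{u}_2 = \pm\mathbf{u}_3$. 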
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If ui X U2 = Ug (and uUi ° U2 X Us = +1), we say that U1, Uo, and 
u3 constitute a right-handed system of vectors. Let us assume that this is 
true for the moment. Then with the help of identity (5-1), we have 


Uz X Uz = Ue X (uy, X u2) 
= (Ug: Ug)U, — (u2 u1)u2 
= Uj, 
and similarly, 
u; X ug = uy X (uy X Ua) 
(uy -Ug)U, — (WU) < u1)u2 
= — Uv. 


Substituting these into (5-20), we find that if $u_1$, $u_2$, and $u_3$ constitute a
right-handed orthonormal system, then


$$A \times B = (a_2 b_3 - a_3 b_2) u_1 - (a_1 b_3 - a_3 b_1) u_2 + (a_1 b_2 - a_2 b_1) u_3. \tag{5-21}$$


Comparing this with formula (4-10), we find that the expansion of the 
cross product in terms of any right-handed orthogonal coordinate system 
is the same. Thus the cross product does not depend on the choice of a 
coordinate system, so long as the chosen system is still right-handed. 

The student can easily verify that if the coordinate system is left-handed
(that is, if $u_1 \times u_2 = -u_3$), then instead of (5-21) we find a
formula for $A \times B$ which is the negative of that given in (5-21).
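The two invariance statements above can be checked numerically. The following is a minimal sketch, not part of the original text: the orthonormal set and the component triples are illustrative choices of ours, and the helper names `dot`, `cross`, and `combine` are assumptions. Exact fractions are used so that the comparisons involve no rounding.

```python
from fractions import Fraction as F

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def combine(coeffs, basis):
    # coeffs[0]*basis[0] + coeffs[1]*basis[1] + coeffs[2]*basis[2]
    return [sum(c * u[i] for c, u in zip(coeffs, basis)) for i in range(3)]

# Illustrative orthonormal set (an assumption, not taken from the text).
u1 = [F(1, 3), F(2, 3), F(2, 3)]
u2 = [F(2, 3), F(1, 3), F(-2, 3)]
u3 = cross(u1, u2)                    # makes the set right-handed
right = [u1, u2, u3]
left = [u1, u2, [-c for c in u3]]     # reversing u3 gives a left-handed set

a = [2, -1, 3]    # components of A relative to the chosen set
b = [1, 4, -2]    # components of B relative to the chosen set

# Right-handed case: the component formulas agree with the direct computation.
A, B = combine(a, right), combine(b, right)
same_dot = dot(A, B) == dot(a, b)
same_cross = cross(A, B) == combine(cross(a, b), right)

# Left-handed case: the component formula for the cross product changes sign.
A2, B2 = combine(a, left), combine(b, left)
negated = cross(A2, B2) == [-c for c in combine(cross(a, b), left)]

print(same_dot, same_cross, negated)
```

Under these assumptions all three comparisons should hold, which is what (5-17), (5-21), and the remark on left-handed systems predict.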

Finally, we remark that if we have any three vectors which are not 
coplanar, then from Theorem 5-10 we can express any vector X as a linear 
combination of these. We can think of the coefficients of these vectors as 
being the coordinates of the vector X (or point X) in terms of an oblique 
coordinate system. The rectangular parallelepiped in Fig. 3-1 would be 
replaced by a parallelepiped whose edges are parallel to the given vectors 
(as in Fig. 5-3). While the oblique coordinates are useful in some cases,
their use makes the formulas for the dot and cross product very complicated.


PROBLEMS 


1. Construct an orthonormal set out of the given vectors by the method 
described in this section. 


(a) Ai = (1, 3, 0), Az = [=L 1, 0) 
(b) $A_1 = [-1, 0, 1]$, $A_2 = [0, 2, 1]$

(c) $A_1 = [2, 1, -3]$, $A_2 = [1, -1, 2]$
(d) Ai = (1, 2, 1), Ao = [—2, =; 2] 


2. Express the vector $B = [2, 2, 2]$ as a linear combination of the orthonormal
set $u_1$, $u_2$, and $u_3$ for each part of Problem 1.


3. Prove that the right-hand side of (5-20) reduces to the negative of (5-21)
if $u_1 \times u_2 = -u_3$.


4. Is the set of vectors defined by (5-18) a right-handed system? Does this fact
depend on $A_1$ and $A_2$?


5. Prove that the vectors (5-19) form an orthonormal set. Are they a
right-handed system?


6. For each of the orthonormal sets of vectors obtained in Problem 1, find
formulas for the coordinates $(x', y', z')$ of a point $(x, y, z)$ when the $x'$-, $y'$-,
and $z'$-axes are determined by $u_1$, $u_2$, and $u_3$, respectively.


7. Show that each of the following sets of vectors is an orthonormal set. Give
formulas for the coordinates $(x', y', z')$ of a point $(x, y, z)$ when the $x'$-, $y'$-,
and $z'$-axes are determined by $u_1$, $u_2$, and $u_3$, respectively. Express
$A = [5, -2, 1]$ as a linear combination of $u_1$, $u_2$, and $u_3$.


(a) ul = (0, $, $ |, u2 = (+3, —ia: Ts); u3 = 
(b) u; = [$, $, —$], u2 = [=$ $, ~al, 5 u3 = ls, $, $] 
(c) ui = [§, —3, 3], uz = [—iš -i5 Ts) us = if, —i5: —18] 


8. Let $u_1$, $u_2$, and $u_3$ be a right-handed, orthonormal set of vectors. Set


$$N = e_1 \times u_1 + e_2 \times u_2 + e_3 \times u_3,$$
$$X = x e_1 + y e_2 + z e_3,$$

and

$$Y = x u_1 + y u_2 + z u_3.$$


(a) Prove that the angle between $e_i$ and N is the same as the angle between
$u_i$ and N for $i = 1, 2,$ and 3.

(b) Show that $|X| = |Y|$, and prove that the angle between X and N is the
same as the angle between Y and N for any x, y, and z.

(c) Prove that $X = Y$ if X is collinear with N. [Hint: Use (5-16) on both
X and Y.]


Remark: A rotation of space which carries $e_1$ to $u_1$, $e_2$ to $u_2$, and $e_3$ to
$u_3$ can be thought of as a function which maps a point X to a point Y, where
X and Y are given as above. The vector N is then the axis of the rotation.
Property (b) shows that N is the axis, and property (c) shows that any
rotation must have an axis.


9. Find the axis N as defined in Problem 8 for each of the sets of vectors in
Problem 7. Make a sketch showing these vectors for each of the cases.


10. Let $u_1$, $u_2$, and $u_3$ be any three noncoplanar vectors (they need not form an
orthonormal set). Define


$$v_1 = c(u_2 \times u_3), \qquad v_2 = c(u_3 \times u_1), \qquad v_3 = c(u_1 \times u_2),$$


where $c = 1/(u_1 \cdot u_2 \times u_3)$. The $v_i$ are called a dual basis to the $u_i$.

(a) Prove that

$$v_i \cdot u_j = \begin{cases} 0 & \text{if } i \neq j, \\ 1 & \text{if } i = j. \end{cases}$$

(b) Prove that for any vector A,


$$A = (A \cdot v_1) u_1 + (A \cdot v_2) u_2 + (A \cdot v_3) u_3,$$

and also

$$A = (A \cdot u_1) v_1 + (A \cdot u_2) v_2 + (A \cdot u_3) v_3.$$


(c) Find the dual basis to A, B, and C in each of the four parts of Problem 12,
Section 5-2. Use the results of part (b) of this problem to obtain D as a
linear combination of A, B, and C.


5-4 PROJECTIONS AND DISTANCE FORMULAS 


In Section 3-5 we introduced the projection of vectors in the direction 
of a line and we have made use of this idea several times in various applica- 
tions. The formula which was introduced as the definition is not par- 
ticularly easy to remember by itself. At this point, we would like to 
exhibit another way of looking at the projection. From this point of view, 
the formula for the projection can be rederived easily as needed. 


Let A and $B \neq 0$ be given vectors. We wish to find the projection of A
in the direction of the line determined by B. The situation is illustrated 
in Fig. 5-4. Here, the required projection is P, and we see that the basic 
requirements are satisfied when V = A — P is orthogonal to B. In order 
that P be collinear with B, we must have 


P = tB (5-22) 


for some scalar t. But then, in order to have V orthogonal to B, we must
have


$$V \cdot B = A \cdot B - P \cdot B = A \cdot B - t(B \cdot B) = 0.$$


This will be true only if $t = (A \cdot B)/(B \cdot B)$. Using this
value in (5-22) yields exactly the formula given in Definition 3-16.

FIGURE 5-4
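The derivation of t translates directly into a short computation. What follows is only a sketch under our own assumptions: the function name `project_onto_direction` and the sample vectors are illustrative and do not come from the text.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_onto_direction(A, B):
    """Return P = tB with t = (A.B)/(B.B), the projection of A along B."""
    bb = dot(B, B)
    if bb == 0:
        raise ValueError("B must be a nonzero vector")
    t = Fraction(dot(A, B), bb)
    return [t * x for x in B]

# Illustrative vectors (assumed for this sketch).
A = [3, 1, 2]
B = [1, 1, 0]

P = project_onto_direction(A, B)
V = [a - p for a, p in zip(A, P)]   # the part of A orthogonal to B

print(P, V, dot(V, B))
```

Here $t = 4/2 = 2$, so P is twice B, and the remainder V = A - P has zero dot product with B, as the argument requires.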

This particular point of view can be extended to give us the definition 
of the projection of a vector onto a plane. 


Definition 5-4. Let A be a given vector and M a given plane. Then 
the projection of A onto M is a vector P which lies in the plane and is
such that

$$A = P + V$$


for some vector V orthogonal to M.





FIGURE 5-5

The contents of this definition are illustrated
in Fig. 5-5. From this figure it is easy to see
how we can obtain formulas for the projection of a vector onto a plane.
Suppose $N \neq 0$ is orthogonal to the plane. Then it is clear that we want
V to be collinear with N to satisfy the requirements of Definition 5-4.
Likewise, it seems clear from the figure that V must actually be the
projection of A in the direction of N. Using this would give


$$P = A - V = A - \frac{A \cdot N}{|N|^2} N.$$


This result has been obtained by means of our geometric understanding 
rather than by formal computation. However, once we have obtained this 
formula, it is easy to verify it. Direct computation shows that if P is 
defined in this way, then $P \cdot N = 0$ and hence that P is parallel to the
given plane. Thus, we have proved: 


Theorem 5-15. Let M be a plane orthogonal to a vector $N \neq 0$. Then
the projection of a vector A onto M is given by


$$P = A - \frac{A \cdot N}{|N|^2} N. \tag{5-23}$$





For example, let us find the projection of $A = [2, 5, -1]$ onto the plane
with equation $3x - y + 2z - 5 = 0$. The normal vector to this plane
is $N = [3, -1, 2]$. Here, $|N|^2 = 14$ and $A \cdot N = -1$. Hence


$$P = [2, 5, -1] + \tfrac{1}{14}[3, -1, 2] = \left[\tfrac{31}{14}, \tfrac{69}{14}, -\tfrac{12}{14}\right].$$
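The same example can be reproduced with exact arithmetic. This is a small sketch of formula (5-23); the function name `project_onto_plane` is our own, and only the vectors A and N are taken from the example.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_onto_plane(A, N):
    """Formula (5-23): P = A - ((A.N)/|N|^2) N, for a normal vector N != 0."""
    nn = dot(N, N)
    if nn == 0:
        raise ValueError("N must be a nonzero vector")
    t = Fraction(dot(A, N), nn)
    return [a - t * n for a, n in zip(A, N)]

A = [2, 5, -1]
N = [3, -1, 2]      # normal to the plane 3x - y + 2z - 5 = 0

P = project_onto_plane(A, N)
print(P, dot(P, N))
```

The fractions are printed in lowest terms, so the last component appears as -6/7 rather than -12/14. The final dot product confirms that P is orthogonal to N, that is, parallel to the plane.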


In Section 4-3 we derived an expression for the distance from a point 
to a plane. We now wish to find expressions for the distance between a 
point and a line and for the distance between two lines. 

Suppose that we are given a line L which contains the point A and has 
the direction B. Thus the line L has parametric equation 


$$X = A + tB.$$

We wish to find the distance from a point P to the line L. From geometric
considerations it is clear that the required distance is exactly the
length of the vector $\overrightarrow{PC}$ from P to a point C on the line which is such that
$\overrightarrow{PC}$ is orthogonal to B. This can be verified by noting that if D is any
point on the line, then (see Fig. 5-6)


$$\overrightarrow{PD} = \overrightarrow{PC} + \overrightarrow{CD}.$$

Since $\overrightarrow{PC}$ is orthogonal to B and $\overrightarrow{CD}$ is
collinear with B, $\overrightarrow{PC}$ and $\overrightarrow{CD}$ must be orthogonal.
But then


$$|\overrightarrow{PD}|^2 = |\overrightarrow{PC}|^2 + |\overrightarrow{CD}|^2, \tag{5-24}$$


as can be shown by direct calculation or by reference to Theorem 3-8.
(The fact that the square of the magnitude of the sum of two orthogonal
vectors is the sum of the squares of their magnitudes is exactly the
Pythagorean theorem.)

FIGURE 5-6

By letting the point D vary along the line L, we see from (5-24) that
the minimum distance $|\overrightarrow{PD}|$ is attained when $C = D$, and thus $|\overrightarrow{PC}|$ is
the distance from P to L.

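To find $\overrightarrow{PC}$ we can proceed in any one of several different ways. Perhaps
the easiest way is to note that


$$\overrightarrow{PC} = \overrightarrow{AC} - \overrightarrow{AP}.$$

(We remark that the reader frequently will find it easier to obtain an
equation such as this by noting first that $\overrightarrow{AP} + \overrightarrow{PC} = \overrightarrow{AC}$.) The vector
$\overrightarrow{AC}$ is the projection of $\overrightarrow{AP}$ onto the direction of B, and hence


$$\overrightarrow{AC} = \frac{\overrightarrow{AP} \cdot B}{|B|^2} B,$$

giving

$$\overrightarrow{PC} = \frac{\overrightarrow{AP} \cdot B}{|B|^2} B - \overrightarrow{AP}.$$


To find the length of $\overrightarrow{PC}$, we can calculate


$$\begin{aligned}
|\overrightarrow{PC}|^2 &= \overrightarrow{PC} \cdot \overrightarrow{PC} \\
&= \frac{(\overrightarrow{AP} \cdot B)^2}{|B|^4} |B|^2 - 2\,\frac{(\overrightarrow{AP} \cdot B)^2}{|B|^2} + |\overrightarrow{AP}|^2 \\
&= |\overrightarrow{AP}|^2 - \frac{(\overrightarrow{AP} \cdot B)^2}{|B|^2} \\
&= \frac{|\overrightarrow{AP} \times B|^2}{|B|^2},
\end{aligned}$$


where the last step follows from Eq. (5-7) of Section 5-1.
This same result can also be obtained by observing that if $\theta$ is the angle
between $\overrightarrow{AP}$ and B, then

$$|\overrightarrow{PC}| = |\overrightarrow{AP}| \sin\theta,$$

while

$$|\overrightarrow{AP} \times B| = |\overrightarrow{AP}|\,|B| \sin\theta.$$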

Either way, we have proved 


Theorem 5-16. Let L be the line through A with direction B, and let
P be a given point. Then the distance from P to L is d, where


$$d = \frac{|\overrightarrow{AP} \times B|}{|B|}. \tag{5-25}$$

The point C on L which is closest to P is given by

$$C = A + \frac{\overrightarrow{AP} \cdot B}{|B|^2} B. \tag{5-26}$$

It should be noted that the distance d given by (5-25) will be zero if 
and only if the point P is on the line. 
As an illustration of the use of these formulas, let us find the distance
between the line


$$X = [2, 3, -5] + t[1, -2, 2]$$


and the point $P = (15, 7, 6)$. Here, $B = [1, -2, 2]$, and $\overrightarrow{AP} = [13, 4, 11]$.
Starting with (5-26), we see that the projection of $\overrightarrow{AP}$ in the direction of
the line is


$$\overrightarrow{AC} = \frac{\overrightarrow{AP} \cdot B}{|B|^2} B = \frac{27}{9} B = [3, -6, 6].$$


Hence $C = A + \overrightarrow{AC} = [5, -3, 1]$, and


$$d = |\overrightarrow{PC}| = |[-10, -10, -5]| = 5\,|[2, 2, 1]| = 5 \cdot 3 = 15.$$
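Both routes to the distance can be checked by machine. The sketch below implements (5-25) and (5-26); the function names are our own assumptions, and only the data A, B, and P come from the illustration.

```python
import math
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def closest_point_on_line(A, B, P):
    """Formula (5-26): C = A + ((AP.B)/|B|^2) B."""
    AP = sub(P, A)
    t = Fraction(dot(AP, B), dot(B, B))
    return [a + t * b for a, b in zip(A, B)]

def distance_point_line(A, B, P):
    """Formula (5-25): d = |AP x B| / |B|."""
    n = cross(sub(P, A), B)
    return math.sqrt(dot(n, n) / dot(B, B))

A, B, P = [2, 3, -5], [1, -2, 2], [15, 7, 6]

C = closest_point_on_line(A, B, P)
PC = sub(C, P)
d_from_C = math.sqrt(dot(PC, PC))           # length of PC
d_from_cross = distance_point_line(A, B, P)

print(C, d_from_C, d_from_cross)
```

The two distances agree, which is the content of the last step in the derivation of (5-25).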


Alternatively, we could compute

$$\overrightarrow{AP} \times B = [30, -15, -30],$$

and, using (5-25), find

$$d = \frac{|[30, -15, -30]|}{|[1, -2, 2]|} = \frac{45}{3} = 15.$$

Now, we wish to turn to the problem of finding a formula for the distance
between two lines. Let $L_1$ be the line through $A_1$ with the direction
$B_1$, and let $L_2$ be the line through $A_2$ with direction $B_2$. Suppose
that there exist points $C_1$ and $C_2$ on $L_1$ and $L_2$, respectively, such that
$\overrightarrow{C_1C_2}$ is orthogonal to both $L_1$ and $L_2$. Then $|\overrightarrow{C_1C_2}|$ is the required
distance between $L_1$ and $L_2$. This is easily seen by considering the planes
orthogonal to $\overrightarrow{C_1C_2}$ through $A_1$ and $A_2$, respectively. The lines $L_1$ and
$L_2$ will be contained in these respective planes (why?), and hence the
distance between these planes will be the minimum distance between the
lines. (See Fig. 5-7.)

The vector $\overrightarrow{C_1C_2}$ is clearly the projection of the vector $\overrightarrow{A_1A_2}$ in the
direction of $\overrightarrow{C_1C_2}$. But since $\overrightarrow{C_1C_2}$ is assumed to be orthogonal to both $B_1$
and $B_2$, it must be collinear with $B_1 \times B_2$ (assuming that $B_1$ and $B_2$ are
not collinear). Therefore $\overrightarrow{C_1C_2}$ is the projection of $\overrightarrow{A_1A_2}$ in the direction
of $B_1 \times B_2$, and hence

$$|\overrightarrow{C_1C_2}| = \frac{|\overrightarrow{A_1A_2} \cdot B_1 \times B_2|}{|B_1 \times B_2|}. \tag{5-27}$$

This result was obtained under the assumption that points $C_1$ and $C_2$
with the required properties exist. The existence of these points can be
verified in several different ways, but with the proper point of view it is
easy to see.


FIGURE 5-7          FIGURE 5-8


We suppose that $B_1$ and $B_2$ are not collinear. Then $B_1 \times B_2 \neq 0$, and
we can imagine the planes orthogonal to $B_1 \times B_2$ through $A_1$ and $A_2$.
The line $L_1$ is in one of these planes and the line $L_2$ is in the other. If
we make the orthogonal projection of the line $L_2$ into the plane
containing $L_1$, we see that $L_1$ and the projection of $L_2$ must intersect. The
point of intersection will be the point $C_1$ (Fig. 5-8).

This rather intuitive argument can be made rigorous quite easily.
Introduce an orthonormal set of vectors $u_1$, $u_2$, and $u_3$ such that $u_1$ is parallel
to $B_1$, $u_2$ is in the plane determined by $B_1$ and $B_2$, and $u_3$ is collinear
with $B_1 \times B_2$, by the method of the last section. Then we have


$$B_1 = b_1 u_1, \qquad B_2 = c_1 u_1 + c_2 u_2, \tag{5-28}$$

where $b_1 = \pm|B_1| \neq 0$ and $c_2 \neq 0$, since otherwise $B_1$ and $B_2$ would
have been collinear. The lines $L_1$ and $L_2$ have equations


$$X = A_1 + tB_1, \qquad Y = A_2 + sB_2, \tag{5-29}$$


respectively. If X is an arbitrary point on $L_1$, having coordinate t, and
Y is an arbitrary point on $L_2$ with coordinate s, then


$$\begin{aligned}
\overrightarrow{XY} &= A_2 - A_1 + sB_2 - tB_1 \\
&= (A_2 - A_1) + (s c_1 - t b_1) u_1 + s c_2 u_2.
\end{aligned}$$


Let us suppose that $A_2 - A_1 = a_1 u_1 + a_2 u_2 + a_3 u_3$. Then

$$\overrightarrow{XY} = (a_1 + s c_1 - t b_1) u_1 + (a_2 + s c_2) u_2 + a_3 u_3,$$


and since $B_1 \times B_2$ is parallel to $u_3$, the problem is whether or not we can
find values for s and t so that $a_1 + s c_1 - t b_1 = 0$ and $a_2 + s c_2 = 0$.
If so, these values when put into (5-29) would give us the required points
$C_1$ and $C_2$.

However, the second equation,


$$a_2 + s c_2 = 0,$$


can be solved for s, since from (5-28) the number $c_2$ is not zero. Having
solved for s, we can solve for t in the first equation,


$$a_1 + s c_1 - t b_1 = 0,$$


since $b_1 \neq 0$. Therefore the required points $C_1$ and $C_2$ exist.

Note that in this proof we are not particularly interested in finding
$C_1$ and $C_2$. All we are really after is the information that these points
exist. With this knowledge, the rest of the calculations are easy.
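Although the proof needs only the existence of the two points, they are easy to compute when wanted. The following sketch does not introduce the vectors $u_1$, $u_2$, $u_3$; it is a swapped-in approach of ours that imposes the two orthogonality conditions directly and solves the resulting pair of linear equations for s and t. The function name and the sample lines are assumptions made for illustration.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def closest_points(A1, B1, A2, B2):
    """Points C1 on X = A1 + t*B1 and C2 on Y = A2 + s*B2 with C1C2
    orthogonal to both B1 and B2 (directions must not be collinear)."""
    W = sub(A2, A1)
    p, q11, q22 = dot(B1, B2), dot(B1, B1), dot(B2, B2)
    det = q11 * q22 - p * p            # equals |B1 x B2|^2
    if det == 0:
        raise ValueError("directions are collinear")
    r1, r2 = -dot(W, B1), -dot(W, B2)
    # Solve  p*s - q11*t = r1  and  q22*s - p*t = r2  by Cramer's rule.
    s = Fraction(q11 * r2 - p * r1, det)
    t = Fraction(p * r2 - q22 * r1, det)
    C1 = [a + t * b for a, b in zip(A1, B1)]
    C2 = [a + s * b for a, b in zip(A2, B2)]
    return C1, C2

# Illustrative skew lines (assumed data).
A1, B1 = [1, 0, 0], [1, 1, 0]
A2, B2 = [0, 0, 2], [1, -1, 0]

C1, C2 = closest_points(A1, B1, A2, B2)
gap = sub(C2, C1)
print(C1, C2, gap)
```

For these lines the connecting vector is [0, 0, 2], orthogonal to both directions, so the distance between the lines is 2.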


Theorem 5-17. Let $L_1$ be the line through $A_1$ in the direction $B_1$, and
let $L_2$ be the line through $A_2$ in the direction $B_2$. If $B_1$ and $B_2$ are not
collinear, then the distance between $L_1$ and $L_2$ is


$$d = \frac{|\overrightarrow{A_1A_2} \cdot B_1 \times B_2|}{|B_1 \times B_2|}. \tag{5-30}$$


If $B_1$ and $B_2$ are collinear, then the distance between $L_1$ and $L_2$ is


$$d = \frac{|\overrightarrow{A_1A_2} \times B_1|}{|B_1|}. \tag{5-31}$$

The proof of the second part of this result will be left as one of the
problems. It should be noted that the distance d given in this theorem will
be zero if and only if the two lines intersect.

As an example, let us find the distance between the lines


$$X = [1, 0, 5] + t[2, 0, 3]$$

and

$$X = [4, -1, 2] + t[1, -1, 0].$$


Here, $B_1 = [2, 0, 3]$ and $B_2 = [1, -1, 0]$. Hence $B_1 \times B_2 = [3, 3, -2]$.
Also, $\overrightarrow{A_1A_2} = [3, -1, -3]$. Therefore, the required distance is


$$d = \frac{9 - 3 + 6}{(22)^{1/2}} = \frac{12}{\sqrt{22}}.$$
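Both cases of Theorem 5-17 fit in one short routine. This is a sketch under our own naming; the first call uses the lines of the example, and the parallel pair in the second call is assumed data chosen so that the answer is easy to see.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def distance_between_lines(A1, B1, A2, B2):
    """Formula (5-30) when B1 x B2 != 0, formula (5-31) otherwise."""
    W = sub(A2, A1)
    M = cross(B1, B2)
    if any(M):                                   # directions not collinear
        return abs(dot(W, M)) / math.sqrt(dot(M, M))
    K = cross(W, B1)                             # collinear directions
    return math.sqrt(dot(K, K)) / math.sqrt(dot(B1, B1))

# The example from the text.
normal = cross([2, 0, 3], [1, -1, 0])
d_skew = distance_between_lines([1, 0, 5], [2, 0, 3], [4, -1, 2], [1, -1, 0])

# Two parallel lines (illustrative): the x-axis and a line through (0, 3, 4).
d_parallel = distance_between_lines([0, 0, 0], [1, 0, 0], [0, 3, 4], [2, 0, 0])

print(normal, d_skew, d_parallel)
```

The computed cross product is [3, 3, -2], and the first distance equals $12/\sqrt{22}$. The parallel pair gives 5, the length of [0, 3, 4].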


PROBLEMS 


1. For each of the following sets of vectors, find the projection of $A_3$ onto a
plane parallel to both $A_1$ and $A_2$.


(a) Ai = 1, 3, 0], = [= 14, 0], A3 = (1, 2, 1] 
(b) $A_1 = [-1, 0, 1]$, $A_2 = [0, 2, 1]$, $A_3 = [1, 1, 0]$
(c) $A_1 = [2, 1, -3]$, $A_2 = [1, -1, 2]$, $A_3 = [3, 0, 4]$
(d) Ai = (1, 2, 1], =. |= =a a 2], A3 = (1, l, 5] 


2. For each of the sets of vectors in Problem 1, find the projection of $A_3$ onto a
plane orthogonal to $A_1$.


3. Find s and t such that

$$\overrightarrow{XY} = A_2 - A_1 + sB_2 - tB_1$$


is orthogonal to both $B_1$ and $B_2$ by direct calculation (without introducing
the vectors $u_1$, $u_2$, and $u_3$). Show that the result can be written in the form


$$s = -\frac{[(A_1 - A_2) \times B_1] \cdot [B_1 \times B_2]}{|B_1 \times B_2|^2},$$

$$t = -\frac{[(A_1 - A_2) \times B_2] \cdot [B_1 \times B_2]}{|B_1 \times B_2|^2}.$$
4. Set 
A, = (3, 2, 4), = [1, 0, 1], 
A2 = (1, —3, 1), i = [0, —9d, 3], 
A3 = (—2, 1, 2), B; = [7, 5, ie 
A4 = (2,1, =l Bs = [— =ð, =l]; 
A5 = (5, 5, 10), B; = 1, 0, ii 
$P_1 = (1, 3, 5)$, $P_2 = (0, 5, 0)$,
$P_3 = (5, 4, 3)$, $P_4 = (-3, 0, 1)$.

Let $L_1$ be the line through $A_1$ in the direction $B_1$, $L_2$ the line through $A_2$ in
the direction $B_2$, etc.

(a) Find the distance from $P_i$ to $L_i$, for $i = 1, 2, 3,$ and 4.

(b) Find the distance from $L_1$ to each of the other lines.


5. Using Theorem 5-16, prove the second proposition of Theorem 5-17. 


6. Let C be a given point and M a given plane. Let L be the line through C
orthogonal to M. Then by the projection of C onto M we will mean the point
D at which the line L cuts the plane. If M is the plane through the point A,
orthogonal to B, prove that the projection of C onto M is the point D, where


$$D = C - \frac{\overrightarrow{AC} \cdot B}{|B|^2} B.$$


[Hint: Consider Fig. 5-5.]


7. In what way is Problem 6 related to the problem of finding the point on a 
given plane which is closest to a given point? 


5-5 GENERAL METHODS* 


The results of Sections 5-2 and 5-3 are of great importance in the 
general study of vector spaces. These results, however, were obtained with 
the aid of the properties of the cross product. The cross product is an 
artifact of the three-dimensional vector space which does not exist (in the 
same form) in spaces of other dimensions. In this section, it is our purpose
to show how certain results obtained in Sections 5-2 and 5-3 can be 
obtained by methods which do not involve the cross product. 


The first topic we wish to discuss is linear dependence. Recall that in
Definition 5-2, a collection of vectors, $A_1, A_2, \ldots, A_n$, was called linearly
dependent if and only if there exist scalars $\lambda_i$ $(i = 1, 2, \ldots, n)$, not all
zero, such that

$$\lambda_1 A_1 + \lambda_2 A_2 + \cdots + \lambda_n A_n = 0. \tag{5-32}$$


A set of vectors which is not linearly dependent is called linearly
independent. Let us now investigate some properties of this concept.


Theorem 5-18. If a collection of vectors contains the zero vector, then 
the collection is linearly dependent. 


* The material discussed in this section is not essential to this course, but 
it does constitute an introduction to some very important topics taken up in 
later courses. It would be well worth the student’s while to study this material. 
In particular, students interested in applications of mathematics are advised 
to study the proof of Theorem 5-24 with care.

Proof: If the collection is $A_1, A_2, \ldots, A_{n-1}, 0$, then

$$\lambda_1 A_1 + \lambda_2 A_2 + \cdots + \lambda_{n-1} A_{n-1} + \lambda_n 0 = 0$$


when we set $\lambda_1 = \lambda_2 = \cdots = \lambda_{n-1} = 0$ and $\lambda_n = 1$. Here, not all
of the $\lambda_i$ are zero, since $\lambda_n = 1$.


Theorem 5-19. If a collection of vectors contains two identical vectors, 
then the collection is linearly dependent. 


Proof: Let all of the $\lambda_i$ in (5-32) be zero except for the coefficients of the
two identical vectors. Let one of these have the coefficient $+1$ and the
other the coefficient $-1$. We then have a linear combination of the
vectors which is zero, but with some nonzero coefficients.


Theorem 5-20. If a set of vectors is linearly independent, then any 
nonempty subset of this set is also linearly independent. If a collection 
of vectors is linearly dependent, then any enlarged collection is also 
linearly dependent. 


Proof: The two parts of this theorem are logically equivalent, so let us
prove only the second part. Let $A_1, A_2, \ldots, A_n$ be a linearly dependent
collection of vectors. Then there exist $\lambda_i$ $(i = 1, 2, \ldots, n)$, not all zero,
such that


$$\lambda_1 A_1 + \lambda_2 A_2 + \cdots + \lambda_n A_n = 0.$$


If the enlarged collection is $A_1, A_2, \ldots, A_n, B_1, B_2, \ldots, B_k$, then letting
the $\lambda_i$ be the same as above (and hence not all zero), and all of the $\mu_i$ be
zero, we have


$$\lambda_1 A_1 + \cdots + \lambda_n A_n + \mu_1 B_1 + \cdots + \mu_k B_k = 0.$$


Definition 5-5. A collection of vectors $A_1, A_2, \ldots, A_n$ is called a set of
generators if and only if every vector can be expressed as a linear
combination of these. That is, if B is any vector, there exist scalars $\lambda_i$ such
that

$$B = \lambda_1 A_1 + \lambda_2 A_2 + \cdots + \lambda_n A_n.$$


Directly from this definition we are able to prove

Theorem 5-21. If $A_1, A_2, \ldots, A_n$ is a set of generators, and if B is any
vector, then the collection of vectors $B, A_1, A_2, \ldots, A_n$ is linearly
dependent.

Proof: By the definition, there exist scalars $\lambda_1, \ldots, \lambda_n$ such that


$$B = \lambda_1 A_1 + \cdots + \lambda_n A_n.$$

But then

$$B - \lambda_1 A_1 - \cdots - \lambda_n A_n = 0,$$


and this collection is linearly dependent (the coefficient of B is $1 \neq 0$).


Theorem 5-22. Let $A_1, A_2, \ldots, A_n$ be a linearly dependent sequence of
nonzero vectors. Then there is some k with $1 \leq k < n$ such that the
sequence of vectors $A_1, \ldots, A_k$ is linearly independent and $A_{k+1}$ is a
linear combination of these. That is, there exist scalars $\lambda_1, \lambda_2, \ldots, \lambda_k$
such that


$$A_{k+1} = \lambda_1 A_1 + \lambda_2 A_2 + \cdots + \lambda_k A_k.$$


Proof: We look at the sequence of vectors, and let k be the smallest
integer for which the sequence $A_1, A_2, \ldots, A_{k+1}$ is linearly dependent.
Since $A_1$ is nonzero, k must be greater than or equal to one. On the
other hand, since the entire sequence is linearly dependent, k is less than n.

Since k is chosen to be the smallest integer such that the sequence
$A_1, \ldots, A_{k+1}$ is linearly dependent, there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_{k+1}$,
not all zero, such that


$$\alpha_1 A_1 + \alpha_2 A_2 + \cdots + \alpha_{k+1} A_{k+1} = 0.$$


However, $\alpha_{k+1} \neq 0$, for if $\alpha_{k+1} = 0$, then not all the $\alpha_i$ being zero would
imply that the sequence $A_1, \ldots, A_k$ is linearly dependent, contradicting
the way k was chosen. This equation can then be divided through by
$\alpha_{k+1}$ and brought into the form required by the theorem.


Finally we prove 


Theorem 5-23. Any four vectors are linearly dependent; and if three 
vectors are linearly independent, they are a set of generators. 


Proof: If we are given four vectors, and if three of them are linearly 
dependent, then the whole collection is linearly dependent. On the other 
hand, if three of the vectors are linearly independent and we prove that 
they are a set of generators, then Theorem 5-21 shows that all four are 
linearly dependent. So suppose that $A_1$, $A_2$, and $A_3$ are linearly independent.
We will prove that they are a set of generators.

The method of proof that we use is called the method of replacement.
We start with a known set of generators, $e_1$, $e_2$, and $e_3$ (why is this a set
of generators?) and add $A_1$ to this set. By Theorem 5-21 the resulting
collection of four vectors $A_1$, $e_1$, $e_2$, and $e_3$ is linearly dependent. From
Theorem 5-22, we find one of these to be a linear combination of the
previous ones.
Suppose, for example, that we can solve for $e_3$, finding


$$e_3 = \lambda_1 A_1 + \lambda_2 e_1 + \lambda_3 e_2.$$


But now, we can remove $e_3$ from the collection $A_1$, $e_1$, $e_2$, $e_3$ and still have
a set of generators. For if B is an arbitrary vector with


$$B = \mu_1 e_1 + \mu_2 e_2 + \mu_3 e_3,$$

then

$$\begin{aligned}
B &= \mu_1 e_1 + \mu_2 e_2 + \mu_3(\lambda_1 A_1 + \lambda_2 e_1 + \lambda_3 e_2) \\
&= \mu_3\lambda_1 A_1 + (\mu_1 + \mu_3\lambda_2) e_1 + (\mu_2 + \mu_3\lambda_3) e_2 \\
&= \nu_1 A_1 + \nu_2 e_1 + \nu_3 e_2.
\end{aligned}$$


Next we add $A_2$ to the collection and look at the sequence $A_2$, $A_1$, $e_1$, $e_2$
(assuming $e_3$ was the one removed). The same reasoning shows that we
can remove another of the vectors. But the one removed will not be
$A_1$ or $A_2$, since these two are linearly independent. We therefore have a
sequence $A_1$, $A_2$, $e_i$ (where $e_i$ is one of the original three) which is still a
set of generators.

Finally we add $A_3$ to the collection, getting $A_3$, $A_2$, $A_1$, $e_i$. This time,
when we remove a vector, it must be $e_i$, since the collection $A_3$, $A_2$, $A_1$ is
linearly independent. We are therefore able to conclude that the three
vectors are a set of generators, and the theorem is proved.


The reader should study carefully the process used in this proof. We 
start with a set of generators. When another vector is added, the resulting 
collection must be linearly dependent. Hence one of the original vectors 
can be removed and still leave a set of generators. A vector is pushed in 
at one end, forcing one out at the other. Observe also that in this proof 
we have made use only of the definitions, the properties of vectors as found 
in Theorems 3-3 and 3-4, and the fact that there are three special vectors 
($e_1$, $e_2$, and $e_3$) which generate the entire space. Theorem 5-23 should be
compared with Theorem 5-12. The two are essentially equivalent. 
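Linear dependence can also be tested mechanically, again without the cross product. The sketch below does not follow the method of replacement used in the proof; it is a swapped-in technique, ordinary Gaussian elimination with exact fractions, that counts how many of the given vectors are independent. The function names are our own assumptions.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0                                   # rows already reduced
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_dependent(vectors):
    return rank(vectors) < len(vectors)

four = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [2, -3, 5]]
three = [[1, 2, 1], [0, 1, 1], [1, 0, 1]]
pair = [[1, 2, 3], [2, 4, 6]]

print(linearly_dependent(four), linearly_dependent(three), linearly_dependent(pair))
```

Since a vector has only three components, the rank can never exceed three, so any four vectors are reported as dependent, in agreement with Theorem 5-23.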

Next, we add to the above assumptions the properties of the dot product 
as given in Theorem 3-5, and show how we can obtain a close approxima- 
tion to Theorem 5-14 without having to use the cross product. 


Theorem 5-24. Suppose $A_1$, $A_2$, and $A_3$ are linearly independent. Then
there exists an orthonormal set $u_1$, $u_2$, and $u_3$ such that $u_1$ is a scalar
multiple of $A_1$, $u_2$ is a linear combination of $A_1$ and $A_2$, and $u_3$ is a
linear combination of $A_1$, $A_2$, and $A_3$.

Proof: Since $A_1$, $A_2$, and $A_3$ are linearly independent, none can be the
zero vector. We start by setting


$$u_1 = A_1/|A_1|$$

just as in Theorem 5-14.

Next we wish to find $u_2$ coplanar with $A_1$ and $A_2$, and orthogonal to
$u_1$. Refer back to Fig. 5-4 for a picture of what we wish to accomplish. In
this figure, we let $B = u_1$, and $A = A_2$. Then the orthogonal vector V,
which we call $V_2$, must be


$$V_2 = A_2 - (A_2 \cdot u_1) u_1. \tag{5-33}$$


This same result can also be obtained in a purely algebraic manner.
The desired vector $V_2$ is to be a linear combination of $A_2$ and $u_1$. Since
it is to be orthogonal to $u_1$ it cannot be collinear with $u_1$. We can therefore
assume that the desired vector is of the form


$$V_2 = A_2 + t u_1.$$


To find t so that $V_2$ is orthogonal to $u_1$ we set $V_2 \cdot u_1 = 0$. This gives
$A_2 \cdot u_1 + t = 0$, and hence $t = -(A_2 \cdot u_1)$, resulting in the same vector
$V_2$ as obtained in (5-33).
The vector $V_2$ is orthogonal to $u_1$, but it is not of unit length in general.
It is not the zero vector, however, since $A_1$ and $A_2$ are linearly independent.
Therefore, we set

$$u_2 = V_2/|V_2|.$$


We can continue this process in a similar manner to find a vector
orthogonal to both $u_1$ and $u_2$. The desired vector can be determined with the
help of the concept of projection (see Fig. 5-5) or directly in the following
way. The vector we wish to find is to be a linear combination of $A_3$, $u_1$,
and $u_2$. Hence we can assume


$$V_3 = A_3 + s u_1 + t u_2. \tag{5-34}$$


We can then solve for s and t by using the conditions that $V_3$ is orthogonal
to $u_1$ and $u_2$. Setting $V_3 \cdot u_1 = 0$ in (5-34) gives


$$A_3 \cdot u_1 + s = 0.$$


Similarly, setting $V_3 \cdot u_2 = 0$ gives


$$A_3 \cdot u_2 + t = 0.$$

Hence we find that


$$V_3 = A_3 - (A_3 \cdot u_1) u_1 - (A_3 \cdot u_2) u_2. \tag{5-35}$$

Note that this is merely $A_3$ minus the projection of $A_3$ onto the plane
determined by $u_1$ and $u_2$. The required vector of length 1 is then finally
given by

$$u_3 = \frac{V_3}{|V_3|}.$$


The process that we have gone through here is of great theoretical (and 
in many cases, practical) importance. The method produces an ortho- 
normal set of vectors successively from a given set of vectors by sub- 
tracting from each vector its projection on the “plane” determined by the 
previously determined vectors. Equations (5-33) and (5-35) serve to show 
how the method proceeds. The process in no way depends on the fact 
that our vectors are three-dimensional, and can be extended to any vector 
space with an inner product. It is known as the Gram-Schmidt orthog- 
onalization process. 
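The procedure is short enough to write out in full. The following is a sketch of the process as described by (5-33) and (5-35), using floating-point arithmetic; the function name, the tolerance, and the three sample vectors are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list built from linearly independent vectors.

    Each new vector has its components along the previously found unit
    vectors subtracted, as in (5-33) and (5-35), and is then normalized.
    """
    basis = []
    for A in vectors:
        V = list(A)
        for u in basis:
            c = dot(A, u)                       # coefficient (A . u)
            V = [v - c * x for v, x in zip(V, u)]
        norm = math.sqrt(dot(V, V))
        if norm <= tol:
            raise ValueError("vectors are linearly dependent")
        basis.append([v / norm for v in V])
    return basis

# Illustrative linearly independent vectors (assumed data).
A1, A2, A3 = [1, 2, 1], [-2, 1, 2], [1, 1, 5]
u1, u2, u3 = gram_schmidt([A1, A2, A3])

for u in (u1, u2, u3):
    print([round(x, 6) for x in u])
```

Nothing in the loop refers to the number of components, which reflects the remark that the process extends to any vector space with an inner product.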


6 


The Conic Sections 


6-1 THE DEFINITION OF CONIC SECTIONS 


In their study of geometry, the early Greek mathematicians gave special 
consideration to those curves which could be obtained by cutting a right 
circular cone with a plane. Such curves occupy an important position 
throughout geometry and analysis, and their properties must be well known 
to any student planning to do further work in mathematics. 

To start our discussion of the conic sections, let us first find the vector 
form of the equation of a right circular cone. Three things are required in 
order to define a cone. We must specify a point to be the vertex of the 
cone (A in Fig. 6-1), a vector N to define the direction of the axis of the
cone, and an angle $\theta$ to be the half angle of the cone. A point $X = (x, y, z)$
is on the cone if and only if the vector $\overrightarrow{AX}$ makes the angle $\theta$ with N or
$-N$. This is the same as


$$|(X - A) \cdot N| = \lambda |X - A|\,|N|,$$


where $\lambda = \cos\theta$. The absolute value on
$(X - A) \cdot N$ is needed to give us both sides
of the cone. If the absolute value signs were
left off, what points X would satisfy this
relation?





FIGURE 6-1


Definition 6-1. Given a point A, a nonzero vector N, and a real number
$\lambda$ with $0 < \lambda < 1$, a right circular cone is


$$\{X = (x, y, z) \mid |(X - A) \cdot N| = \lambda |X - A|\,|N|\}.$$


The point A is called the vertex of the cone; the line $X = A + tN$ is
called the axis of the cone; and the angle $\theta$ with $0 < \theta < \pi/2$ such
that $\cos\theta = \lambda$ is called the half angle of the cone.
The two sets

$$\{X \mid (X - A) \cdot N = \lambda |X - A|\,|N|\},$$

$$\{X \mid (X - A) \cdot N = -\lambda |X - A|\,|N|\}$$


are called the nappes of the cone.
If $X_0$ is any point on the cone other than the vertex, then the line
$X = A + t(X_0 - A)$ is called a generator of the cone.


Although the set defined here is correctly known as a right circular cone, 
we shall just call it a cone for the time being. Since only the direction of N 
is needed to determine the axis of the cone, we can set B = N/|N| and use 
this vector of unit length to determine the axis. That is, if $A = (a_1, a_2, a_3)$
and $B = [b_1, b_2, b_3]$, with $|B| = 1$, the above relation can be written as


$$|b_1(x - a_1) + b_2(y - a_2) + b_3(z - a_3)| = \lambda[(x - a_1)^2 + (y - a_2)^2
+ (z - a_3)^2]^{1/2}.$$


To eliminate the absolute value signs, we can square this equation to
obtain


$$[b_1(x - a_1) + b_2(y - a_2) + b_3(z - a_3)]^2 = \lambda^2[(x - a_1)^2
+ (y - a_2)^2 + (z - a_3)^2]. \tag{6-1}$$


This equation is satisfied by the coordinates of each point X on the cone,
and if the coordinates of a point satisfy this equation, then that point is
on the cone. Therefore, if we specify that $b_1^2 + b_2^2 + b_3^2 = 1$, then (6-1)
is the general form of the cartesian equation of a cone.
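The defining relation of the cone is easy to test for a given point. The sketch below checks the relation of Definition 6-1 up to a small tolerance; the function name `on_cone`, the tolerance, and the particular vertex, axis, and half angle are assumptions chosen for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def on_cone(X, A, N, lam, tol=1e-9):
    """True if |(X - A).N| = lam * |X - A| * |N| holds up to the tolerance."""
    D = [x - a for x, a in zip(X, A)]
    lhs = abs(dot(D, N))
    rhs = lam * math.sqrt(dot(D, D)) * math.sqrt(dot(N, N))
    return abs(lhs - rhs) <= tol * max(1.0, rhs)

# Illustrative cone: vertex A, axis along the z-direction, half angle 30 degrees.
A = [1.0, -2.0, 3.0]
N = [0.0, 0.0, 2.0]
theta = math.radians(30)
lam = math.cos(theta)

# A point at distance 2 from the vertex, making the angle theta with N.
X0 = [A[0] + 2 * math.sin(theta), A[1], A[2] + 2 * math.cos(theta)]

# Points on the generator X = A + t(X0 - A); negative t lies on the other nappe.
generator_points = [[a + t * (x - a) for a, x in zip(A, X0)] for t in (-3.0, 0.5, 4.0)]
on_generator = [on_cone(X, A, N, lam) for X in generator_points]

# A point displaced from the vertex at right angles to the axis is not on the cone.
off_axis = on_cone([A[0] + 1.0, A[1], A[2]], A, N, lam)

print(on_cone(X0, A, N, lam), on_generator, off_axis)
```

Every point of the generator through $X_0$ passes the test, including those with negative t, which is why the absolute value is needed to capture both nappes.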

In order to study the intersection of a cone with a plane we merely have 
to choose a plane and find the points common to this plane and the cone. 
The most convenient plane to use is of course the xy-coordinate plane
($z = 0$). If we set $z = 0$ in the equation of the cone, we are left with an
equation containing only x and y. This will then be the cartesian equation,
in the xy-plane, of the conic section.

Setting $z = 0$ in the equation of the cone given above, expanding and
rearranging terms gives an equation of the form


$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0. \tag{6-2}$$


We can conclude that every conic section in the xy-plane satisfies a general
quadratic equation.

To make a more careful study of the conic sections we need to simplify 
the equations somewhat. This can be done by proper adjustment of the 
parameters determining the cone. Thus in the position $A = (a_1, a_2, a_3)$
of the vertex of the cone, we can change $a_1$ and $a_2$ in any way we may
find useful. Such changes merely translate the conic section on the plane.
Likewise the direction of the axis $B = [b_1, b_2, b_3]$ can be altered as desired
by changing $b_1$ and $b_2$ (keeping $|B| = 1$). The effect on the conic section
will be a rotation in the xy-plane. Since $b_3$ is the cosine of the angle between
B and the z-axis, fixing it determines the angle between B and the xy-plane.
(See Problem 13, at the end of Section 4-4.)
Let us set $a_2 = 0$ and $b_2 = 0$. This corresponds to assuming that the
vertex of the cone is in the xz-plane and that the axis of the cone is in this
plane also. Later, we will adjust $a_1$ as needed to give the greatest
simplification. Note that when $b_2 = 0$, $b_1$ is the cosine of the angle between B
and the xy-plane.

Substituting these values in the cartesian equation of the cone and 
setting z = 0, we find, after some algebraic manipulation, the equation of 
the conic to be 


$$(\lambda^2 - b_1^2)x^2 - 2[a_1(\lambda^2 - b_1^2) - b_1 b_3 a_3]x + \lambda^2 y^2
= 2 b_1 b_3 a_1 a_3 - a_1^2(\lambda^2 - b_1^2) - a_3^2(\lambda^2 - b_3^2). \tag{6-3}$$


We will consider various cases of this equation as $b_1$ is varied while
holding fixed $\lambda$, the cosine of the half angle of the cone. Changing $b_1$
corresponds to “tipping” the cone. To fix our picture we will assume that
$b_1 \geq 0$, $b_3 > 0$, and $a_3 < 0$.

The first case we consider is that when $b_1 = 0$.
Then, since $b_1^2 + b_3^2 = 1$ and $b_3 > 0$, $b_3 = 1$.
For this case, we set $a_1 = 0$. Then Eq. (6-3)
becomes


$$\lambda^2 x^2 + \lambda^2 y^2 = a_3^2(1 - \lambda^2),$$

or

$$x^2 + y^2 = \frac{a_3^2(1 - \lambda^2)}{\lambda^2}. \tag{6-4}$$

This is recognized as the equation of a circle
(note that $\lambda^2 < 1$), which is as it should be,
since with $b_1 = 0$, the cone has its axis
orthogonal to the xy-plane (Fig. 6-2).

Let $\phi$ be the angle between the axis of the cone and the $xy$-plane. Then 
$\cos\phi = b_1$. Since the axis of the cone is in the $xz$-plane, $\phi$ is also the 
angle between the axis of the cone and the $x$-axis. In Fig. 6-3 we show the 
cross section of the cone in the $xz$-plane for several possible cases. Letting 
$\theta$ be the half angle of the cone, it is clear that the “bottom” generator of 
the cone makes an angle of $\phi - \theta$ with the $x$-axis. 

Therefore, if $\theta < \phi$ or equivalently, if $0 < b_1 < \lambda$ (since the cosine 
decreases as the angle increases), all generators of the cone are pointing 
“upward” and the $xy$-plane cuts through a single nappe of the cone. The 
resulting curve is called an ellipse (see Fig. 6-3a). 

In this case, we have $0 < \lambda^2 - b_1^2 < \lambda^2$. Set $\lambda^2 - b_1^2 = \mu^2$. To 
simplify Eq. (6-3) as much as possible we set $a_1 = b_1b_3a_3/\mu^2$. That is, we 
move the vertex of the cone so as to obtain the greatest simplification. 





FIGURE 6-2 

FIGURE 6-3 (panels (a), (b), (c)) 


The result is shown in Fig. 6-4. Equation (6-3) then reduces to 

    $\mu^2x^2 + \lambda^2y^2 = k^2$.   (6-5) 


Here, the right-hand side of (6-3) reduces to a constant which we have 
written as a positive constant $k^2$. To see that this is possible we merely 
need to observe that the cone must cut the $xy$-plane somewhere, hence 
there exist points $(x, y)$ which satisfy (6-3). But the left-hand side of 
(6-3) reduces to the left-hand side of (6-5), which is positive for any 
$(x, y)$. Hence the right-hand side must be a positive constant. 

FIGURE 6-4        FIGURE 6-5 

The next case to be considered is that for which $b_1 = \lambda$. Here, the half 
angle of the cone is the same as the angle between its axis and the $x$-axis. 
Hence, exactly one of the generators of the cone will be parallel to the 
$x$-axis (Fig. 6-3b). The intersection will be a single curve which never 
closes. This conic section is called a parabola. Inserting this value into 
Eq. (6-3), we find that 

    $2b_1b_3a_3x + \lambda^2y^2 = 2b_1b_3a_1a_3 - a_3^2(\lambda^2 - b_3^2)$. 


In this equation $b_1b_3a_3 \ne 0$ except for certain limiting cases. We may 
adjust $a_1$ so that the right-hand side of this equation is zero. Then, dividing 
through by $\lambda^2$, we find that the equation of the parabola has the form 

    $y^2 = kx$,   (6-6) 


where k is some nonzero constant (Fig. 6-5). 

The final case to be considered is that when $0 < \lambda < b_1$. In Fig. 6-3(c) 
we observe that under this condition the plane $z = 0$ will cut both nappes 
of the cone. The conic section thus consists of two parts. The resulting 
curve is called a hyperbola. 

The coefficient of $x^2$ in (6-3) is negative, so we set $\lambda^2 - b_1^2 = -\mu^2$. 
Then we can set $a_1 = -b_1b_3a_3/\mu^2$ to eliminate the coefficient of the $x$ 
term. The left-hand side of (6-3) then becomes $-\mu^2x^2 + \lambda^2y^2$. From 
the geometric position of the cone, it is easily seen that there must be two 
distinct points on the $x$-axis which are on the hyperbola. Since at least 
one of these has a nonzero $x$-coordinate, we see that the constant on the 
right-hand side of (6-3) must be negative. That is, the hyperbola must 
satisfy an equation of the form 

    $-\mu^2x^2 + \lambda^2y^2 = -k^2$.   (6-7) 

FIGURE 6-6 


This case is illustrated in Fig. 6-6. 

In the above discussion $b_1$ is the cosine of the angle between the axis 
of the cone and the $x$-axis and at the same time the cosine of the angle 
between the axis of the cone and the $xy$-plane. Therefore, each of the conic 
sections considered above satisfies the following formal definition. 


Definition 6-2. A nondegenerate conic section is the intersection of a 
right circular cone with a plane which does not pass through the vertex 
of the cone. Given such a cone and plane, let $\theta$ be the half angle of the 
cone and let $\phi$ be the angle between the axis of the cone and the plane. 
Then the resulting conic section is called: 

(1) a circle if and only if $\phi = \pi/2$, 
(2) an ellipse if and only if $\pi/2 > \phi > \theta$, 
(3) a parabola if and only if $\phi = \theta$, 
(4) a hyperbola if and only if $\theta > \phi$. 
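
The four cases of Definition 6-2 translate directly into a small decision procedure. The following is a minimal sketch; the function name and the tolerance used to compare floating-point angles are our own choices, not part of the definition.

```python
import math

def classify_conic(theta, phi, tol=1e-12):
    """Name the conic for half angle theta and axis-to-plane angle phi (radians)."""
    if not (0.0 < theta < math.pi / 2 and 0.0 <= phi <= math.pi / 2):
        raise ValueError("expected 0 < theta < pi/2 and 0 <= phi <= pi/2")
    if abs(phi - math.pi / 2) <= tol:     # plane orthogonal to the axis
        return "circle"
    if abs(phi - theta) <= tol:           # plane parallel to a generator
        return "parabola"
    return "ellipse" if phi > theta else "hyperbola"

for theta, phi in [(math.pi / 6, math.pi / 2), (math.pi / 6, math.pi / 3),
                   (math.pi / 6, math.pi / 6), (math.pi / 3, math.pi / 6)]:
    print(classify_conic(theta, phi))
```

Exact equality of angles is fragile in floating-point arithmetic, which is why the circle and parabola cases are tested with a tolerance.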


The calculations of this section can then be summarized in the following 
theorem: 


Theorem 6-1. The intersection of a right circular cone with the $xy$-plane 
is the locus of an equation of the form 

    $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$. 

Let $\lambda$ be the cosine of the half angle of the cone and let $b_1$ be the cosine 
of the angle between the axis of the cone and the plane. Set 
$\mu^2 = |\lambda^2 - b_1^2|$. Then for a suitable location of the cone with respect to the 
$x$- and $y$-axes, an ellipse, a parabola, or a hyperbola will be the locus 
of the equation 

    $\mu^2x^2 + \lambda^2y^2 = k^2$, 

    $y^2 = kx$, 
or 
    $-\mu^2x^2 + \lambda^2y^2 = -k^2$, 

respectively, where $k$ is a nonzero constant whose value depends on the 
particular conic section in question. 




We remark that the size and shape of the conic section is determined 
completely by three quantities: the half angle of the cone, the angle between 
the axis of the cone and the plane, and the distance between the vertex 
of the cone and the plane. The phrase, “suitable location of the cone,” 
found in this theorem should be interpreted as allowing the cone to be 
moved in any way which does not change any of these three quantities. 


PROBLEMS 


1. Write the cartesian equations of each of the following cones: 
(a) The cone with vertex at (0, 1, 1), axis parallel to the z-axis, and half 
angle 60° 

(b) The cone with vertex at (0, 0, 2), axis parallel to B = [2, 0, 1], and half 
angle 45° 

(c) The cone with vertex at (0, 0, 1), axis parallel to B = [1, 0, 1], and half 
angle 45° 


2. Find and simplify as much as possible the cartesian equation of the conic 
sections obtained as the intersections of the cones of Problem 1 with the 
$xy$-plane. Identify each conic section as to type. 


3. Prove algebraically that the right-hand side of (6-3) is positive when 
$0 < \lambda^2 - b_1^2 < \lambda^2$ and when we put $a_1 = b_1b_3a_3/(\lambda^2 - b_1^2)$. 


4. Let A be the vertex of a right circular cone and let Xo be any point other than 
$A$ on the cone. Prove that every point of the line $X = A + t(X_0 - A)$ is 
on the cone. 


5. The intersection of a cone and a plane which passes through the vertex of the 
cone is a degenerate conic section. 


(a) What is the general form of the equation for the intersection of a right 
circular cone whose vertex is at (0, 0, 0) with the plane z = 0? 

(b) Give an example of a cone in (a) in which the conic section consists of the 
single point (0, 0, 0). 

(c) Give an example of a cone for which the conic section is the single line 
x = 0. 

(d) Give an example of a cone for which the conic section is the pair of lines 
x = 0 and y = 0. 


6-2 EQUIVALENT DEFINITIONS 


There are several other possible ways to define the conic sections. In 
this section we wish to investigate these alternatives. 

It has been known for a long time that if we are given an ellipse, then 
there are two points in the plane such that the sum of the distances from 
these points to every point of the ellipse is a constant. This condition can 
be written in the form: 

    $|X - F_1| + |X - F_2| = 2a$, 

where $F_1$ and $F_2$ are the fixed points, $a$ is a constant ($2a$ is used here 
because it leads to certain simplifications later), and $X$ is any point 
of the ellipse. In this form, the two fixed points are called the foci of 
the ellipse. 

A simple proof of this property was discovered in the early nineteenth 
century by the Belgian mathematician Dandelin. The Dandelin proof is 
illustrated in Fig. 6-7. We imagine a cone and an intersecting plane 
forming an ellipse. Every point on the central axis of the cone is 
equidistant from the generators of the cone (see Problem 1 at the 
end of this section), and hence each point of the central axis is the center 
of a sphere tangent to the sides of the cone. Exactly two of these spheres 
would also be tangent to the plane which cuts off the ellipse as shown in 
Fig. 6-7. The points of tangency, $F_1$ and $F_2$, are the foci of the ellipse. 
Let $P$ be any point of the ellipse. There is a unique generator of the cone 
through the point $P$. Let this generator be tangent to the two spheres at 
$C_1$ and $C_2$. Then the distance between $C_1$ and $C_2$ is a fixed constant, say $2a$, 
independent of which point $P$ has been chosen. Now consider the line 
segments $F_1P$ and $F_2P$. The two segments $F_1P$ and $C_1P$ are both tangent 
to the lower sphere from the point $P$ and hence have the same length, 
$|F_1P| = |C_1P|$. Similarly, $|F_2P| = |C_2P|$, and hence, since $P$ lies 
between $C_1$ and $C_2$ on the generator, 

    $|F_1P| + |F_2P| = |C_1P| + |C_2P| = |C_1C_2| = 2a$. 

FIGURE 6-7 

We have therefore proved 

Theorem 6-2. For any ellipse, there exist a constant $a$ and two points, 
$F_1$ and $F_2$, which are in the plane of the ellipse, such that if $X$ is any 
point on the ellipse, then 

    $|XF_1| + |XF_2| = 2a$.   (6-8) 

The points $F_1$ and $F_2$ are called the foci of the ellipse. 


In an exactly similar manner a focal relation for the hyperbola can be 
obtained. Two points $F_1$ and $F_2$, called the foci of the hyperbola, can be 
found such that for every point $X$ on the hyperbola, the absolute value of 
the difference of the distances from $X$ to $F_1$ and from $X$ to $F_2$ will be a 
constant. Verification of this will be left to the reader, with the observation 
that in the case of the hyperbola the two spheres are in different nappes 
of the cone. 


Theorem 6-3. For any hyperbola, there exist a constant $a$ and two 
points, $F_1$ and $F_2$, which are in the plane of the hyperbola, such that if 
$X$ is any point on the hyperbola, then 

    $\bigl|\,|XF_1| - |XF_2|\,\bigr| = 2a$.   (6-9) 

The points $F_1$ and $F_2$ are called the foci of the hyperbola. 


Let us look at another characterization of the conic sections which can 
be obtained from the introduction of the sphere of the Dandelin proof. 
This sphere, which is tangent to both the cone and the intersecting plane, 
is called the Dandelin sphere. The circle consisting of those points at which 
the Dandelin sphere is tangent to the cone lies in a plane which is orthog- 
onal to the axis of the cone. (See Problem 1 at the end of this section.) 
In Fig. 6-8 we have indicated this plane together with the cutting plane 
which determines the conic section. Note that these two planes can be 
determined for any of the configurations discussed in Section 6-1. That is, 
we are not restricting ourselves to a particular one of the conics (except 
that the circle does not fall under our discussion here). In Fig. 6-9, where 
we have redrawn the important features of the geometric configuration 
which we wish to consider, F is the focus of the conic, that is, the point at 
which the sphere is tangent to the cutting plane. Just as in the discussion 
above, we see that if $P$ is an arbitrary point on the conic, then $|FP| = 
|CP|$, where the line $CP$ is along a generator of the cone and $C$ is the point 
at which this generator is tangent to the sphere. 





FIGURE 6-8 FIGURE 6-9 
Let $DP$ be the line segment parallel to the axis of the cone such that $D$ 
is on the plane through the circle of tangency. Thus $DP$ is orthogonal to 
this plane. Then if $\theta$ is the half angle of the cone, the angle between $CP$ 
and $DP$ is $\theta$, and hence 

    $|DP| = |CP| \cos\theta = |FP| \cos\theta$.   (6-10) 


Let L be the line of intersection of these two planes. Then L is orthogonal 
to the axis of the cone. Let the plane through P orthogonal to L cut this 
line at E. Then DP is in this plane (since DP is parallel to the axis of the 
cone and hence is orthogonal to L) and the triangle DEP is a right triangle. 
Let $\alpha$ be the angle between the two planes and let $\phi = \pi/2 - \alpha$. This 
angle $\phi$ is then exactly the angle between the axis of the cone and the 
plane that was discussed in the last section. Observe that the angle $\phi$ 
is the vertex angle of the triangle $DPE$ at $P$, and therefore 

    $|DP| = |PE| \cos\phi$.   (6-11) 


Combining this equation with (6-10), we find 

    $|FP| = |PE| \dfrac{\cos\phi}{\cos\theta}$, 

or, setting 

    $e = \dfrac{\cos\phi}{\cos\theta}$,   (6-12) 

we have 

    $|FP| = |PE|\,e$. 


In this relation, the quantity e is known as the eccentricity of the conic 
and the line of intersection of the planes is called the directrix of the conic. 
When the cutting plane is orthogonal to the axis of the cone and thus 
yields a circle, this formula is not strictly applicable since the directrix 
does not exist. However, in this case we define $e = 0$. If the cutting plane 
is allowed to tip, $e$ continuously increases while an ellipse is produced. 
When the plane becomes parallel to a generator of the cone, $\cos\phi = \cos\theta$, 
$e = 1$, and we have a parabola. For a hyperbola $e > 1$. The maximum 
value of $e$ for a given cone is $1/\cos\theta$, but this can be made as large as 
desired by widening the cone and letting $\theta$ get close to $\pi/2$ (it can 
never become $\pi/2$). 
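
The behavior of $e$ as the cutting plane tips can be tabulated directly from (6-12). The sketch below is illustrative only; the function name and the sample half angle of 30° are our own choices.

```python
import math

def eccentricity(theta, phi):
    """e = cos(phi) / cos(theta), Eq. (6-12); both angles in radians."""
    return math.cos(phi) / math.cos(theta)

theta = math.radians(30.0)                  # sample half angle of the cone
for phi_deg in (90.0, 75.0, 60.0, 45.0, 30.0, 15.0, 0.0):
    e = eccentricity(theta, math.radians(phi_deg))
    print(f"phi = {phi_deg:5.1f} degrees   e = {e:.4f}")
```

As $\phi$ decreases from $\pi/2$ to $0$, the printed values rise from (essentially) $0$ through $1$ at $\phi = \theta$ to the maximum $1/\cos\theta$.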

In the case of the ellipse or hyperbola, there are two Dandelin spheres 
and hence two foci. The above argument can be used at either focus. Thus 
the ellipse and hyperbola have two directrices, one associated with each 
focus. The parabola, on the other hand, will have a single directrix and 
focus. 
Collecting these results, we have the following theorem. 


Theorem 6-4. Let $F$ be a focus of a nondegenerate conic section which 
is not a circle. Then there exists a line $L$, called a directrix of the conic, 
and a constant $e$, called the eccentricity of the conic, such that for every 
point $X$ on the conic 

    $|XF| = |XE|\,e$,   (6-13) 

where $|XE|$ is the distance from $X$ to the line $L$. The eccentricity $e$ is 
less than 1 for an ellipse, equal to 1 for a parabola, and greater than 1 
for a hyperbola. 


PROBLEMS 


1. Let the vertex of a cone be at the origin. Let $B$ be the unit vector defining 
the axis of the cone, $\theta$ be the half angle of the cone, and $\lambda = \cos\theta$. Let 
$R = rB$ be a point on the axis of the cone, and suppose that $u$ is a unit vector 
along a generator of the cone. 

(a) Let $S$ be the projection of $R$ in the direction of $u$. Prove that 

    $S = \lambda r u$. 

(b) Show that the distance from $R$ to the generator is $|RS| = [1 - \lambda^2]^{1/2} r$, 
and hence that $R$ is the same distance from all generators. 

(c) Let $T$ be the projection of $S$ in the direction of $B$. Show that 

    $T = \lambda^2 r B$. 

(d) Prove that $TS$ is orthogonal to $B$ and hence that the points of tangency 
of the sphere with center $R$ and radius $|RS|$ all lie on a plane orthogonal 
to $B$. 


2. Using the same cone of Problem 1, let $M$ be the plane through the point $Q$ 
orthogonal to the unit vector $v$. Suppose that $Q = kB$ where $k \ne 0$. Suppose 
further that $v \cdot B = [1 - \lambda^2]^{1/2}$. 

(a) Prove that the conic section which results is a parabola. 

(b) Show that the distance from the point $R$ of Problem 1 to the plane $M$ 
is $|k - r|[1 - \lambda^2]^{1/2}$. Prove that the center of the Dandelin sphere for 
this parabola is at $(k/2)B$. 

(c) At what point is the focus? 

(d) Show that $|B \times v| = |B \times (B \times v)| = \lambda$. 

(e) Prove that the directrix of the parabola is the line 

    $X = E_0 + t(B \times v)$, 

where 

    $E_0 = \dfrac{k\lambda^2}{2} B - \dfrac{k(1 - \lambda^2)^{1/2}(2 - \lambda^2)}{2\lambda^2}\, B \times (B \times v)$. 
3. Draw a diagram and give the proof of Theorem 6-3 similar to that given for 
Theorem 6-2. 


4. Let a parabola have a focus F and let L’ be a line which cuts the parabola 
and is parallel to the directrix. Let P be any point of the parabola which is 
on the same side of L’ as the directrix. Let PB be the line segment from 
P to L’ which is orthogonal to L’. Prove that |FP| + |PB| is a constant, 
independent of P. Make a sketch. 


5. Let a solid be made up, as shown in Fig. 6-10, from a cone cut by a plane 
through its axis and a second plane parallel to a generator so that the line L’ 
of intersection of these two planes is orthogonal to the axis. Let A be the 
vertex; let B be the point at which a generator meets the parabola; and let 
BC be a line segment from B to L’ orthogonal to L’. Prove that |AB| + |BC| 
is a constant, independent of the generator chosen. 


FIGURE 6-10 





Remark: This shows that the geodesics (paths of the shortest length) from 
the tip of the cone to the line L’ are all of the same length. If this figure were 
covered by a sheet of explosive, then by starting the explosion at the point A, 
a linear explosive front is produced at L’. The reader can easily verify that 
these paths are indeed geodesics. 


6. The usual method used to draw an ellipse is to stick two tacks into a sheet 
of paper and place a loop of string around these. A curve can then be drawn 
by placing a pencil point into the loop and moving it about, always holding 
the loop tight. (See Fig. 6-11.) Prove that the resulting curve satisfies the 
condition given in Theorem 6-2. 





FIGURE 6-11        FIGURE 6-12 


7. Let a string be fastened at a point F on a drawing board and at the end Q 
of a T-square (Fig. 6-12). Move the square along the board, holding a pencil 
point on the edge of the square at a point P so that the string is tight from 
F to P and from P to Q. As the square is moved, the point P traces out a 
parabola. Prove that the resulting curve satisfies the condition of Theorem 
6-4 with e = 1. 
In the previous sections we have seen several properties of the ellipse. 
In Section 6-1 we saw that a properly located ellipse would satisfy an 
equation of the form 

    $\mu^2x^2 + \lambda^2y^2 = k^2$.   (6-14) 


Let us compare this fact with the result of Theorem 6-2, which tells us 
that each point X of the ellipse satisfies the relation 


    $|XF_1| + |XF_2| = 2a$.   (6-15) 


Suppose that the foci are on the $x$-axis, at equal distances on either side of 
the origin. (Two given points can always be so located by a suitable 
choice of the coordinate system.) Let $F_1$ be the point $(-c, 0)$ and $F_2$ the 
point $(c, 0)$, where $c > 0$. Then relation (6-15) becomes 


    $[(x + c)^2 + y^2]^{1/2} + [(x - c)^2 + y^2]^{1/2} = 2a$. 

Any point $(x, y)$ which satisfies this equation must also satisfy 

    $[(x + c)^2 + y^2]^{1/2} = 2a - [(x - c)^2 + y^2]^{1/2}$, 

and, squaring, must also satisfy 

    $x^2 + 2cx + c^2 + y^2 = 4a^2 - 4a[(x - c)^2 + y^2]^{1/2} + x^2 - 2cx + c^2 + y^2$, 
or 
    $4a[(x - c)^2 + y^2]^{1/2} = 4a^2 - 4cx$. 

Squaring again after dividing by 4, we eliminate the last radical, giving 

    $a^2[x^2 - 2cx + c^2 + y^2] = a^4 - 2a^2cx + c^2x^2$, 
or 
    $(a^2 - c^2)x^2 + a^2y^2 = a^4 - a^2c^2 = a^2(a^2 - c^2)$. 

Dividing through by the right-hand member, we obtain the equation 

    $\dfrac{x^2}{a^2} + \dfrac{y^2}{a^2 - c^2} = 1$.   (6-16) 


Note that since the distance between the foci is 2c, 2a must be greater 
than $2c$ or the ellipse will not exist. Hence $a^2 - c^2 > 0$. Let us set 

    $a^2 - c^2 = b^2$;   (6-17) 

then we finally have the equation 

    $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$.   (6-18) 


We have thus shown that any point which satisfies Equation (6-15) 
satisfies Equation (6-18). 
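
The relation between (6-15) and (6-18) can be spot-checked numerically. The sketch below samples points of (6-18) through the parametrization $x = a\cos t$, $y = b\sin t$ (an assumption of ours, not used in the text) and evaluates the focal sum; note that this tests the converse of the implication just proved, and it is a sanity check rather than a proof.

```python
import math

def focal_sum(x, y, c):
    """|XF1| + |XF2| for the foci F1 = (-c, 0) and F2 = (c, 0)."""
    return math.hypot(x + c, y) + math.hypot(x - c, y)

def max_focal_sum_error(a, c, samples=24):
    """Largest deviation of the focal sum from 2a over sample points of (6-18)."""
    b = math.sqrt(a * a - c * c)                     # Eq. (6-17)
    worst = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        x, y = a * math.cos(t), b * math.sin(t)      # x^2/a^2 + y^2/b^2 = 1
        worst = max(worst, abs(focal_sum(x, y, c) - 2.0 * a))
    return worst

print(max_focal_sum_error(5.0, 3.0) < 1e-9)
```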

Before continuing with the discussion of (6-18), let us look at the second 
representation of the ellipse as given in Theorem 6-4. This says that there 
are a point F, a line L, and a number e, with 0 < e < 1, such that every 
point X on the ellipse satisfies the equation 


    $|XF| = |XE|\,e$,   (6-19) 


where |XE| is the distance from X to the line L. 

Let $L'$ be the line through $F$ orthogonal to $L$, and suppose that $X$ is a 
point on the line $L'$. Then, for some scalar $t$, $X = F + t\overrightarrow{FE}$, where $E$ is 
the point on $L$ closest to $F$ (and also to $X$). That is, $E$ is the point of 
intersection of $L$ and $L'$. Hence 

    $|XF| = |t| \cdot |FE|$, 
and 
    $|XE| = |E - X| = |E - F - t\overrightarrow{FE}| = |\overrightarrow{FE} - t\overrightarrow{FE}| = |1 - t| \cdot |FE|$. 

Therefore, 

    $\dfrac{|XF|}{|XE|} = \dfrac{|t|}{|1 - t|}$. 

To satisfy requirement (6-19) we must have $t/(1 - t) = \pm e$. Since 
$0 < e < 1$, there are two possible values of $t$ which will satisfy this, namely 
$t = e/(1 + e)$ and $t = -e/(1 - e)$. Therefore, there are two points, $A$ 
and $A'$, on the line $L'$ which satisfy the requirement (6-19). 
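
The two parameter values can be checked by direct substitution. In the sketch below the function names are ours; `distance_ratio` is simply the quotient $|t|/|1 - t|$ derived above.

```python
def distance_ratio(t):
    """|XF| / |XE| for the point X = F + t*FE on the line L' (t != 1)."""
    return abs(t) / abs(1.0 - t)

def axis_parameters(e):
    """The two values of t with |XF| / |XE| = e, valid for 0 < e < 1."""
    if not 0.0 < e < 1.0:
        raise ValueError("expected 0 < e < 1")
    return e / (1.0 + e), -e / (1.0 - e)

t_near, t_far = axis_parameters(0.5)
print(t_near, t_far, distance_ratio(t_near), distance_ratio(t_far))
```

The first value lies between 0 and 1, so the corresponding point is between $F$ and $L$; the second is negative, so that point lies on the far side of $F$ from $L$.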

One of these points, which we denote by A, is between F and L (t is 
between 0 and 1). The other, A’, is so located that F is between A’ and 
L. The reader will find it instructive to observe how the ratio |XF|/|XE| 
behaves as the point X moves along the line L’. 

We are free to locate the focus and directrix as we wish so as to simplify 
our computations (but, of course, maintaining the same distance between 
them). Let us place them so that the line $L'$ coincides with the $x$-axis. 
That is, $F$ is on the $x$-axis and the directrix $L$ is orthogonal to the 
$x$-axis. The two points $A$ and $A'$ found above will also be on the $x$-axis, 
and we will suppose that these points are so located that they are at equal 
distances on either side of the origin. Set 

    $A = (a, 0)$,   $A' = (-a, 0)$,   $a > 0$, 
    $F = (c, 0)$, 
    $L = \{(x, y) \mid x = d\}$. 

The points $A$ and $A'$ satisfy relation (6-19). That is, $|AF| = |AE|\,e$ 
and $|A'F| = |A'E|\,e$. Since $0 < a < d$ (and $-a < c < a$, because $A$ lies 
between $F$ and $L$ while $F$ lies between $A'$ and $L$), these are equivalent to 
the equations 

    $a - c = (d - a)e$, 
    $a + c = (d + a)e$. 

Adding these equations gives 

    $2a = 2de$, 

and subtracting them gives 

    $2c = 2ae$. 

Therefore, we may assume that F = (ae, 0) and that the directrix L is 
the line x = a/e, where a is some positive constant. 


FIGURE 6-13 





Let $X = (x, y)$ be an arbitrary point which satisfies the desired relation 
(6-19) (see Fig. 6-13). Then, we must have 

    $[(x - ae)^2 + y^2]^{1/2} = e\,\left|\dfrac{a}{e} - x\right|$. 

Squaring this gives 

    $x^2 - 2aex + a^2e^2 + y^2 = a^2 - 2aex + e^2x^2$, 
or 
    $(1 - e^2)x^2 + y^2 = a^2(1 - e^2)$. 

This equation may be divided through by the right-hand member to give 

    $\dfrac{x^2}{a^2} + \dfrac{y^2}{a^2(1 - e^2)} = 1$.   (6-20) 

This equation is identical to Eq. (6-18) if we set 

    $b^2 = a^2(1 - e^2)$.   (6-21) 
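
As a numerical illustration, one can sample points of (6-20) and compare the distance to the focus $(ae, 0)$ with $e$ times the distance to the directrix $x = a/e$. The parametrization and the function name below are our own assumptions, and the check runs in the converse direction from the derivation above.

```python
import math

def max_focus_directrix_error(a, e, samples=24):
    """Largest value of | |XF| - e*|XE| | over sample points of Eq. (6-20)."""
    b = a * math.sqrt(1.0 - e * e)                   # Eq. (6-21)
    worst = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        x, y = a * math.cos(t), b * math.sin(t)      # a point of (6-20)
        dist_focus = math.hypot(x - a * e, y)        # |XF| with F = (ae, 0)
        dist_directrix = abs(a / e - x)              # |XE| with L: x = a/e
        worst = max(worst, abs(dist_focus - e * dist_directrix))
    return worst

print(max_focus_directrix_error(5.0, 0.6) < 1e-9)
```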


Let us now investigate the rather delicate question of exactly what 
implications have been proved. An ellipse is defined as the point set com- 
mon to a cone and a plane which cuts through a single nappe of the cone. 
In Theorem 6-1 we showed that if the cone and plane were suitably located, 
that is, if a coordinate system is suitably chosen, then a point is on the 
ellipse if and only if it satisfies an equation of the form (6-14). In Section 
6-2, we showed that if a point is on the ellipse, it satisfies an equation of 
the form (6-15) and one of the form (6-19). Here we have shown that if a 
point satisfies an equation of the form (6-15), then it satisfies one of the 
form (6-18) and that if a point satisfies (6-19), then it satisfies an equation 
of the form (6-20). These facts may be summarized in the following dia- 
gram. 


    Ellipse  <==>  (6-14) 
    Ellipse  ==>  (6-15)  ==>  (6-18) 
    Ellipse  ==>  (6-19)  ==>  (6-20) 
    (6-18)  <- - ->  (6-20) 
    (6-20)  - - ->  Ellipse 


The solid arrows in this diagram indicate the implications we have proved. 
The dashed double arrow joining (6-18) and (6-20) is meant to indicate 
that a point which satisfies an equation of either form also satisfies the 
other. This is clearly true. We merely use (6-21) to go from the one 
form to the other. 

The final implication indicated in this diagram, the arrow from (6-20) 
to the word “ellipse” is meant to indicate that if a point satisfies an 
equation of the form (6-20), then it lies on an ellipse with the given 
eccentricity e. To show this we merely have to set 


A (ai SS), 


e 
e 1 
8 = hira ae) P 
SOEN OS 
ee) 


in (6-3) to obtain the equation of a conic section in the xy-plane which is 
identical to (6-20). The student is asked to verify this as an exercise at the 
end of this section. 
With this result, we see that in the above diagram all the implications 
hold, and hence that starting at any point in the diagram and following 
the arrows we can arrive at any other point. In other words, all of these 
characterizations of the ellipse are equivalent. In particular, we have proved 
the next theorem. 


Theorem 6-5. The locus of any equation of the form 

    $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$,   (6-22) 

where $a > b > 0$, is an ellipse.* 


Let us now see what can be determined about the points of an ellipse 
from the equation given in this theorem. We see that if $|x| > a$, this 
equation cannot be satisfied, so that all points of the ellipse must have 
$x$-coordinates between $-a$ and $+a$. Similarly the $y$-coordinates of all points 
of the ellipse must be between $-b$ and $+b$. 

When $x = -a$, only $y = 0$ can satisfy the equation. When $-a < x < a$, 
there will be exactly two values of $y$ satisfying the equation, and hence 
exactly two points of the ellipse with this value for their $x$-coordinate. 
In particular, when $x = 0$, $y = +b$ and $y = -b$ are the two values for $y$. 
When $x = a$, there is again only the single value $y = 0$ which can 
satisfy the equation. 

The ellipse also has a number of symmetry properties. To discuss these 
properties we first need a definition. 


Definition 6-3. Two points $P_1$ and $P_2$ are symmetric with respect to a 
line $L$ if and only if the line $L$ is orthogonal to, and bisects, the line 
segment $P_1P_2$. A set of points $S$ is symmetric with respect to a line $L$ 
if and only if for each point $P$ in $S$, there is also a point $P'$ in $S$ such 
that $P$ and $P'$ are symmetric with respect to the line $L$. 


This definition corresponds to our usual notion of symmetry. It can 
easily be converted to an analytic condition in special cases. For example, 


Theorem 6-6. If $S$ is the set of all points $(x, y)$ in the plane which 
satisfy a functional relationship $f(x, y) = 0$, and if for every $x$ and $y$, 
$f(x, -y) = f(x, y)$, then $S$ is symmetric with respect to the $x$-axis. 
Similarly, if for every $x$ and $y$, $f(-x, y) = f(x, y)$, then $S$ is symmetric 
with respect to the $y$-axis. 


* Compare this with Theorem 6-4. 
It should be noted that this is not an “if and only if” theorem. The 
condition given here gives us what is called a sufficient condition. If it is 
satisfied then the set is symmetric. It is not, however, a necessary condition. 
The locus of $f(x, y) = 0$ can be symmetric with respect to one of the axes 
without the corresponding condition in this theorem being satisfied. The 
proof of this theorem is left as an exercise. 

Now, the equation of the ellipse satisfies the requirements of the above 
theorem [letting $f(x, y) = x^2/a^2 + y^2/b^2 - 1$]; hence we can conclude 
that the ellipse is symmetric with respect to both the $x$- and $y$-axes. This 
is a most remarkable result. It is “obvious” to most people that if a 
plane intersects a cone at an angle, the figure which results must be egg- 
shaped, fatter at the end which is at the wider part of the cone. Indeed, 
when Albrecht Dürer, one of the great inventors of descriptive geometry, 
discovered an accurate method for the construction of an ellipse in the 
early sixteenth century, he allowed his “knowledge” to affect the con- 
struction so as to obtain an egg-shaped figure. An illustration from one 
of Dürer’s books showing this error can be found on page 614 of Volume 1 
of The World of Mathematics, edited by James R. Newman. 

Let us now collect the information we have found about the ellipse. 
In Fig. 6-14 we show a diagram of the ellipse together with the other 
relationships which have been developed. The dashed rectangle is centered 
at the origin and is made up of the lines $x = \pm a$ and $y = \pm b$. The 
ellipse itself lies within this rectangle, touching the four sides of the 
rectangle at the points where the axes cross the sides. The origin is called 
the center of the ellipse (later we will discuss the case of an ellipse whose 
center is located at a point other than the origin). 





Figure 6-14 


Note that $a > b$ (since $b^2 = a^2 - c^2$), hence the rectangle is longer 
in the $x$-direction than it is in the $y$-direction. The $x$-axis (which is the 
line through the two foci) is called the principal axis of the ellipse. The 
$y$-axis, which is the line through the center orthogonal to the principal 
axis, we call the conjugate axis of the ellipse. In traditional usage, these 
axes are called the major and minor axes respectively. These terms will 
not be used here, but the reader should be aware of them. Strictly speaking, 
both of these axes should be called principal axes to conform with modern 
usage in physics and applied mathematics, but our present terminology 
will suffice. 

The two points at which the principal axis cuts the ellipse [the points 
(—a, 0) and (a, 0) in Fig. 6-14] are called the principal vertices (or just the 
vertices) of the ellipse. The points where the conjugate axis cuts the ellipse 
[points $(0, -b)$ and $(0, b)$ in this case] are called the conjugate vertices of 
the ellipse. When we refer to a vertex, without specifying the type, we 
mean a principal vertex. 

The quantities a and b are respectively called the principal and con- 
jugate dimensions of the ellipse. In traditional usage, these are called 
the semimajor axis and semiminor axis. 

Let us collect these terms into a formal definition. 


Definition 6-4. Let $F_1$ and $F_2$ be the foci of an ellipse $E$. Then we define 
the following quantities in the ellipse. 

(1) The center, $C$, is the midpoint of the line segment $F_1F_2$. 

(2) The principal axis is the line $L$ through $F_1$ and $F_2$. 

(3) The conjugate axis is the line $L'$ through $C$ orthogonal to $L$. 

(4) The principal vertices are the points $A_1$ and $A_2$ of intersection of $E$ 
and the principal axis $L$. 

(5) The conjugate vertices are the points $B_1$ and $B_2$ of intersection of $E$ 
with the conjugate axis $L'$. 

(6) The principal dimension is 

    $a = |CA_1| = |CA_2|$. 

(7) The conjugate dimension is 

    $b = |CB_1| = |CB_2|$. 

(8) The focal dimension is 

    $c = |CF_1| = |CF_2|$. 


The relationships between the quantities listed in this definition and the 
other properties of the ellipse are given by the following theorem. 


Theorem 6-7. Let a, b, and c be the principal, conjugate, and focal 
dimensions of an ellipse with eccentricity e. Let d be the distance from 
the center of the ellipse to either directrix. Then the five quantities 
a, b, c, d, and e are related by the three equations 


    $a^2 = b^2 + c^2$,   $c = ae$,   $d = a/e$.   (6-23) 

With one exception, if any two of these five quantities are specified, 
the remaining three are uniquely determined by these relationships. 

These three equations can easily be remembered by keeping Fig. 6-14 
in mind. In this figure, the center, the focus, and the point $(0, b)$ form the 
vertices of a right triangle whose legs are of length $b$ and $c$. The 
hypotenuse of this triangle is, by symmetry, exactly half of the sum of the 
distances from the foci to the point $(0, b)$, and hence must be $a$. Thus the 
Pythagorean relation for this triangle gives the first of the three equations. 

To remember the other two relations, it is only necessary to remember 
that c and d are ae and a/e, and that e < 1. This fact together with a 
recollection of the figure will tell you which is which. 
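
For instance, when $a$ and $e$ are the two specified quantities, the relations (6-23) give the other three at once. The following is a minimal sketch; the function name is ours.

```python
import math

def ellipse_dimensions(a, e):
    """Return (b, c, d) from the principal dimension a and eccentricity e."""
    if not (a > 0.0 and 0.0 < e < 1.0):
        raise ValueError("expected a > 0 and 0 < e < 1")
    c = a * e                      # focal dimension, c = ae
    d = a / e                      # center-to-directrix distance, d = a/e
    b = math.sqrt(a * a - c * c)   # conjugate dimension, from a^2 = b^2 + c^2
    return b, c, d

b, c, d = ellipse_dimensions(5.0, 0.6)
print(b, c, d)
```

Since $e < 1$, the computed values always satisfy $c < a < d$, in agreement with the figure.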


The above discussion could be repeated step for step interchanging the 
roles of x and y. The result would be an ellipse with equation 


    $\dfrac{x^2}{b^2} + \dfrac{y^2}{a^2} = 1$,   (6-24) 


where as before a > b. The resulting ellipse would have its principal axis 
coinciding with the y-axis, and would appear as in Fig. 6-15. 

The same relations between the quantities a, b, c, d, and e hold for this 
case as well. 

Students often seem to have difficulty in recalling which dimension is a 
and which is b when faced with an actual equation. The following procedure 
is therefore recommended. When an equation of the form 


    $\dfrac{x^2}{p^2} + \dfrac{y^2}{q^2} = 1$ 

is given, set $x = 0$ and note that $y = \pm q$. The two points $(0, q)$ and 
(0, —q) may then be marked on a coordinate plane. Similarly, setting 
y = 0 we find the points (p, 0) and (—p, 0) on the ellipse. Mark these 
points on the plane, and draw the rectangle determined by these four 
points (as in the figures). The ellipse may then be sketched within the 
rectangle and the various other quantities determined with a being the 
larger of p and q. 

For example, if we are given the equation 


2 2 
Tria = 
47 9 "i 


we immediately recognize it as the equation of an ellipse of the type shown 
in Fig. 6-15. In this ellipse, the principal dimension is a = 3 and the 
conjugate dimension is b = 2. The focal dimension is then 
$c = \sqrt{a^2 - b^2} = \sqrt{5}$. The eccentricity is $e = c/a = \sqrt{5}/3$, 
and hence $d = a/e = 9/\sqrt{5}$. We therefore have for this ellipse the 
following properties. 


center: (0, 0) 
eccentricity: $e = \sqrt{5}/3$ 
principal axis: x = 0 
conjugate axis: y = 0 
foci: $(0, \sqrt{5})$, $(0, -\sqrt{5})$ 
principal vertices: (0, 3), (0, −3) 
conjugate vertices: (2, 0), (−2, 0) 
directrices: $y = 9\sqrt{5}/5$, $y = -9\sqrt{5}/5$ 

Figure 6-15 


Going in the other direction, we may wish to find the equation of an 
ellipse satisfying certain given conditions. For example, what is the 
equation of the ellipse with foci at (−3, 0), (3, 0), and having a vertex at 
(5, 0)? Note that this must be a principal vertex, since it lies on the line 
through the foci. 

Here, a = 5, c = 3, and hence $b^2 = a^2 - c^2 = 25 - 9 = 16$. 
Therefore, b = 4 and the equation is 

$$\frac{x^2}{25} + \frac{y^2}{16} = 1.$$

(How do we know that 25 divides the $x^2$ rather than the $y^2$?) 
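
The recommended procedure (take a to be the larger of p and q, then apply the three relations) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ellipse_properties` and the dictionary keys are hypothetical and do not come from the text.

```python
import math

def ellipse_properties(p, q):
    """Quantities of the ellipse x^2/p^2 + y^2/q^2 = 1 (hypothetical helper).

    Assumes p > 0, q > 0 and p != q, so that the figure is a genuine
    (non-circular) ellipse with eccentricity 0 < e < 1.
    """
    if p <= 0 or q <= 0 or p == q:
        raise ValueError("need positive p and q with p != q")
    a, b = max(p, q), min(p, q)      # principal and conjugate dimensions
    c = math.sqrt(a * a - b * b)     # focal dimension, from a^2 = b^2 + c^2
    e = c / a                        # eccentricity, from c = ae
    d = a / e                        # directrix distance, from d = a/e
    axis = "x" if p > q else "y"     # coordinate axis carrying the principal axis
    return {"a": a, "b": b, "c": c, "e": e, "d": d, "principal_axis": axis}

# The two worked examples of this section.
first = ellipse_properties(2, 3)     # x^2/4 + y^2/9 = 1
second = ellipse_properties(5, 4)    # x^2/25 + y^2/16 = 1
print(first)
print(second)
```

For the first example the sketch reports the y-axis as principal axis with a = 3 and b = 2, in agreement with the list of properties above.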


PROBLEMS 


1. Using the given relations between a, b, c, d, and e for the ellipse, find formulas 
for the following: 


(a) b, c, and d in terms of a and e only 
(b) c, d, and e in terms of a and b only 
(c) b, d, and e in terms of a and c only 
(d) b, c, and e in terms of a and d only 
(e) a, d, and e in terms of b and c only 
(f) a, c, and d in terms of b and e only 
(g) a, b, and d in terms of c and e only 
(h) a, b, and e in terms of c and d only 
(i) a, b, and c in terms of d and e only 


2. (a) From the relations given in Theorem 6-7, find an equation which involves 
b, d, and e only. 
(b) Using the result of part (a), show that in any ellipse b ≤ d/2. 
(c) Prove that when b < d/2, there are two different ellipses with the same 
values for b and d. 
(d) If b = d/2, what is the eccentricity of the ellipse? What are a and c? 


3. Show that exactly the same equation for an ellipse results if the 
focus-directrix form is assumed with the focus at (−ae, 0) and the line 
x = −a/e as the directrix. 


4. For each of the following equations, make a sketch of the ellipse. Give the 
coordinates of the foci, and the principal and conjugate vertices. Give the 
equations of both directrices. Show these on the sketch. Give the eccentricity. 


@ 54h = ] vith Z 
4 E y B 
Ont = | (d) T 
M (f) 32? + 4y7 = 1 
(g) 6x” + 15y? = 60 (h) Z 4 20y = 5. 


5. Give the equation of the ellipse with center at the origin, x-axis as the 
principal axis, and satisfying the conditions given: 
(a) One (principal) vertex is at (3, 0) and one focus is at (1, 0) 
(b) One vertex is at (3, 0) and a directrix is x = 4 
(c) One vertex is at (3, 0) and the eccentricity is 
(d) One focus is at (4, 0) and the eccentricity is Ẹ 
(e) One focus is at (4, 0) and a directrix is x = 10 
(f) One focus is at (4, 0) and the conjugate dimension is 3 
(g) A directrix is x = 5 and the eccentricity is $ 


6. Sketch and give the quantities asked for in Problem 4 for each of the ellipses 
found in Problem 5. 


7. Prove that the cone defined by the quantities A, B, and λ given by (6-22) 
intersects the plane z = 0 in the ellipse whose equation, as given by (6-3), 
reduces to (6-20). 


8. Prove Theorem 6-6. 


6-4 THE HYPERBOLA 


We will now derive the standard form for the equation of the hyperbola 
from the focal property given in Theorem 6-3: 


$$\bigl|\,|XF_1| - |XF_2|\,\bigr| = 2a. \tag{6-24}$$


In this representation, we will let $F_1 = (c, 0)$, $F_2 = (-c, 0)$, and X = 
(x, y). Then this equation can be written as 


$$\bigl|\,[(x - c)^2 + y^2]^{1/2} - [(x + c)^2 + y^2]^{1/2}\,\bigr| = 2a. \tag{6-25}$$

Any pair (x, y) which satisfies this equation will satisfy one of the pair 
of equations 

$$[(x - c)^2 + y^2]^{1/2} - [(x + c)^2 + y^2]^{1/2} = \pm 2a,$$

or equivalently, 

$$[(x - c)^2 + y^2]^{1/2} = [(x + c)^2 + y^2]^{1/2} \pm 2a.$$

Squaring gives 

$$x^2 - 2cx + c^2 + y^2 = x^2 + 2cx + c^2 + y^2 + 4a^2 \pm 4a[(x + c)^2 + y^2]^{1/2},$$

or equivalently, 

$$\mp 4a[(x + c)^2 + y^2]^{1/2} = 4cx + 4a^2.$$

Dividing by 4 and squaring again gives 

$$a^2x^2 + 2a^2cx + a^2c^2 + a^2y^2 = c^2x^2 + 2a^2cx + a^4.$$

This can be simplified to give 

$$(c^2 - a^2)x^2 - a^2y^2 = a^2c^2 - a^4.$$

Dividing through by the right-hand member finally gives 

$$\frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1. \tag{6-26}$$

If we set 

$$c^2 - a^2 = b^2, \tag{6-27}$$

this can then be written as 

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1. \tag{6-28}$$


Every point (x, y) which satisfies (6-24) must therefore satisfy (6-28). 
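
The derivation shows that (6-24) implies (6-28). As a numerical spot check of the converse direction (not a proof), the sketch below takes a few points on the curve (6-28) and evaluates the left-hand side of (6-24). The parametrization x = ±a cosh t, y = b sinh t is an assumption introduced here for convenience, and the function names are hypothetical.

```python
import math

def focal_difference(x, y, c):
    """Return | |XF1| - |XF2| | for X = (x, y), F1 = (c, 0), F2 = (-c, 0)."""
    d1 = math.hypot(x - c, y)
    d2 = math.hypot(x + c, y)
    return abs(d1 - d2)

def hyperbola_point(a, b, t, branch=1):
    """A point on x^2/a^2 - y^2/b^2 = 1; branch = +1 (right) or -1 (left)."""
    return branch * a * math.cosh(t), b * math.sinh(t)

a, b = 3.0, 4.0
c = math.sqrt(a * a + b * b)         # from c^2 - a^2 = b^2, Eq. (6-27)

diffs = []
for branch in (1, -1):
    for t in (-2.0, -0.5, 0.0, 1.0, 2.5):
        x, y = hyperbola_point(a, b, t, branch)
        diffs.append(focal_difference(x, y, c))

# Each entry should agree with 2a up to rounding error.
print(diffs)
```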

However, we should check whether or not the definition of b? in (6-27) 
is permissible. The distance between the two foci is 2c. If X is an arbitrary 
point on the hyperbola, let l be the smaller of the two distances $|XF_1|$ and 
$|XF_2|$, and m the larger. Then from the triangle inequality (Fig. 6-16), we 
have m < l + 2c. Therefore, 

$$2a = m - l < 2c,$$

or a < c, which shows that the left-hand side of (6-27) is a positive quantity. 

Figure 6-16 

Next we will derive the equation of the hyperbola from the focus-directrix 
property of Theorem 6-10. We assume that we are given a point F, 
a line L, and a real number e > 1. 

Just as in the last section, let L’ be the line through F, orthogonal to L, 
and let E be the intersection of L and L’. Then, as before, we find that if 
X is any point of L’, then X = F + tFE, and 

$$\frac{|XF|}{|XE|} = \frac{|t|}{|1 - t|}.$$

To satisfy the property of Theorem 6-10, we must have 

$$|XF| = |XE|\,e. \tag{6-29}$$

Again, (6-29) will hold when t = e/(1 + e) or t = e/(e − 1). These 
two values of t will give us two points, $A_1$ and $A_2$, on L’ which satisfy 
(6-29). These points are on opposite sides of the directrix and one is 
between the directrix and the focus. 

We may now assume that the focus and directrix have been so located 
on the plane that L’ coincides with the x-axis, and $A_1$ and $A_2$ are at equal 
distances on either side of the origin. Let $A_1 = (a, 0)$, $A_2 = (-a, 0)$, 
F = (c, 0), where a > 0 and c > 0, and suppose the directrix has the 
equation x = d. From the locations of $A_1$ and $A_2$ in relation to the 
focus and the directrix, we have 

$$-a < d < a < c.$$

Since the points $A_1$ and $A_2$ must satisfy (6-29), we find the two equations 

$$c - a = (a - d)e,$$
$$c + a = (a + d)e.$$

Adding and subtracting these equations give 

$$2c = 2ae$$

and 

$$2a = 2de.$$

Figure 6-17 

Therefore c = ae and d = a/e, just as in the case of an ellipse (except 
now e > 1). 

Let X = (x, y) be an arbitrary point on the hyperbola. Then (6-29) must 
be satisfied. In this case (see Fig. 6-17), (6-29) is equivalent to 

$$[(x - ae)^2 + y^2]^{1/2} = e\left|x - \frac{a}{e}\right| = |ex - a|.$$

Squaring this relation gives the equivalent equation 

$$(x - ae)^2 + y^2 = (ex - a)^2,$$

or 

$$x^2 - 2aex + a^2e^2 + y^2 = e^2x^2 - 2aex + a^2.$$

This is again equivalent to 

$$(e^2 - 1)x^2 - y^2 = a^2(e^2 - 1),$$

or 

$$\frac{x^2}{a^2} - \frac{y^2}{a^2(e^2 - 1)} = 1, \tag{6-30}$$

which is in the same form as (6-28) if we set 

$$b^2 = a^2(e^2 - 1) \tag{6-31}$$

(noting that e > 1 and hence $e^2 - 1 > 0$). 
We thus have found the equation 

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$$


for the hyperbola with focus at (c, 0) and directrix x = d, where c = ae 
and d = a/e. Connecting these quantities are the three relations 


$$a^2 + b^2 = c^2,$$
$$c = ae, \tag{6-32}$$
$$d = a/e.$$


The problem of showing that the forms (6-26) and (6-30) are equivalent 
to the definition will be left as an exercise. 
We therefore have 


Theorem 6-8. Any hyperbola in the xy-plane can be so located that it 
is the locus of an equation of the form 

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1,$$

and the locus of any such equation is a hyperbola. For a hyperbola with 
this equation, relations (6-32) connect the quantities a, b, c, d, and e, 
where e is the eccentricity and c and d are the distances from the origin 
to the focus and directrix respectively. 

From Eq. (6-28) we see, just as in the case of the ellipse, that the 
hyperbola is symmetric with respect to both the x-axis and the y-axis. 
The origin, the point at which these two axes of symmetry intersect, is 
called the center of the hyperbola. 
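
The focus-directrix property (6-29) together with the relations (6-32) can be spot-checked numerically. The sketch below is an illustration only: it picks a = 3 and e = 5/3 as sample values, computes b from (6-31), and evaluates the ratio |XF|/|XE| at several points of the upper half of the curve. The helper name `focus_directrix_ratio` is hypothetical.

```python
import math

def focus_directrix_ratio(x, y, a, e):
    """Return |XF| / |XE| for the focus F = (ae, 0) and the directrix x = a/e."""
    dist_focus = math.hypot(x - a * e, y)
    dist_directrix = abs(x - a / e)
    return dist_focus / dist_directrix

a, e = 3.0, 5.0 / 3.0
b = a * math.sqrt(e * e - 1.0)       # from b^2 = a^2 (e^2 - 1), Eq. (6-31)

ratios = []
for x in (3.0, 4.0, 7.5, -3.0, -10.0):           # needs |x| >= a
    y = b * math.sqrt(x * x / (a * a) - 1.0)     # point on x^2/a^2 - y^2/b^2 = 1
    ratios.append(focus_directrix_ratio(x, y, a, e))

# Each ratio should agree with e up to rounding error, on both branches.
print(ratios)
```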

In Fig. 6-18 we sketch the hyperbola determined by an equation of 
the form (6-28). The four quantities a, b, c, and d are indicated in this 
sketch. The dashed lines outline a rectangle centered at the origin with 
sides of length 2a and 2b just as in the case of an ellipse. The hyperbola, 
however, lies outside of this rectangle. 

The line passing through the two foci, in this case the x-axis, we call the 
principal axis of the hyperbola. The two points at which the hyperbola 
crosses this line [the points (a, 0), (−a, 0) in the figure] are called the 
vertices of the hyperbola. The line through the center, orthogonal to the 
principal axis (in this case the y-axis), is called the conjugate axis of the 
hyperbola. Note that there are no conjugate vertices for the hyperbola. 

The quantities a and b are called the principal and conjugate dimensions 
of the hyperbola respectively. 

The two lines shown in Fig. 6-18 forming the diagonals of the dashed 
rectangle have a special relation to the hyperbola. In this figure each 
branch of the hyperbola is shown to lie completely within one of the 
angles formed by these lines and the points of the hyperbola are shown to 
be close to the points of these lines as |x| and |y| become large. The proof 
of Theorem 6-9 will show that this is actually true. These lines are called 
the asymptotes of the hyperbola. 


Definition 6-5. Let $F_1$ and $F_2$ be the foci of a hyperbola H. Then we 
define the following terms. 


(1) The center C is the midpoint of the line segment $F_1F_2$. 
(2) The principal axis is the line L through $F_1$ and $F_2$. 
(3) The conjugate axis is the line L’ through C, orthogonal to L. 
(4) The vertices are the points $A_1$ and $A_2$ of intersection of H and L. 
(5) The principal dimension is $a = |CA_1| = |CA_2|$. 
(6) The focal dimension is $c = |CF_1| = |CF_2|$. 
(7) The conjugate dimension is b, where $b^2 = c^2 - a^2$. 
(8) The asymptotes are the two lines through the center which are the 
diagonals of the rectangle whose sides are parallel to L and L’, 
whose center is C, and whose dimensions are 2a and 2b, the sides of 
length 2b passing through the vertices. 

Figure 6-18 


Theorem 6-9. Each branch of the hyperbola lies completely within one 
of the angles formed by the asymptotes. Points of the hyperbola which 
are far from the center are arbitrarily close to the asymptotes. 


Proof: The asymptotes of the hyperbola (6-28) have equations 

$$\frac{x}{a} - \frac{y}{b} = 0, \qquad \frac{x}{a} + \frac{y}{b} = 0 \tag{6-33}$$


(why?). To prove our assertions, it suffices to show them to be true for 
the one line x/a — y/b = 0 and for the points of the hyperbola both of 
whose coordinates x and y are greater than zero. By symmetry, the 
results will then hold true for all points of the hyperbola. 

What must be shown is that if $(x_1, y_1)$ is any point of the hyperbola with 
both $x_1$ and $y_1 > 0$, and if $(x_1, y_2)$ is a point on the line x/a − y/b = 0, 
then $y_2 > y_1$, and also that as $x_1$ becomes large, the distance between 
the point $(x_1, y_1)$ and the line becomes small. 

Both of these remarks can be proved from the fact that if $(x_1, y_1)$ is on 
the hyperbola, then 

$$1 = \frac{x_1^2}{a^2} - \frac{y_1^2}{b^2} = \left(\frac{x_1}{a} - \frac{y_1}{b}\right)\left(\frac{x_1}{a} + \frac{y_1}{b}\right). \tag{6-34}$$

If $(x_1, y_2)$ is on the line x/a − y/b = 0, then 

$$\frac{x_1}{a} - \frac{y_2}{b} = 0.$$

Hence if $y_1 \ge y_2$, then we would have 

$$\frac{x_1}{a} - \frac{y_1}{b} \le \frac{x_1}{a} - \frac{y_2}{b} = 0,$$

and the product on the right-hand side of (6-34) would be nonpositive (we 
are supposing $x_1 > 0$ and $y_1 > 0$, remember). This is a contradiction. 
Therefore, $y_1 < y_2$ as we wished to show. 

To see the second assertion, we recall that the distance from a point 
$(x_1, y_1)$ to the line x/a − y/b = 0 is given by 

$$\delta = \frac{ab}{[a^2 + b^2]^{1/2}}\left(\frac{x_1}{a} - \frac{y_1}{b}\right)$$

(why?). But from (6-34) we have 

$$\delta = \frac{ab}{[a^2 + b^2]^{1/2}}\left(\frac{x_1}{a} - \frac{y_1}{b}\right) = \frac{ab}{[a^2 + b^2]^{1/2}} \cdot \frac{1}{\dfrac{x_1}{a} + \dfrac{y_1}{b}} < \frac{ab}{[a^2 + b^2]^{1/2}} \cdot \frac{1}{x_1/a}.$$

Here we have used the assumption that $y_1 > 0$. It is clear that as we 
allow $x_1$ to increase, the quantity δ becomes very small. In fact, we can 
make it as small as we desire by choosing $x_1$ large enough. 
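
The two assertions of the proof can be illustrated numerically. The following sketch, with sample values a = 3 and b = 4 chosen here for illustration, computes for several $x_1$ the ordinate $y_1$ on the hyperbola, the ordinate $y_2$ on the asymptote, the distance δ, and the upper estimate $a^2 b / (x_1 [a^2 + b^2]^{1/2})$ obtained at the end of the proof. The function name `asymptote_gap` is hypothetical.

```python
import math

def asymptote_gap(a, b, x1):
    """For the first-quadrant point (x1, y1) on x^2/a^2 - y^2/b^2 = 1 (x1 > a),
    return (y1, y2, delta, bound)."""
    y1 = b * math.sqrt(x1 * x1 / (a * a) - 1.0)       # point on the hyperbola
    y2 = b * x1 / a                                    # point on the asymptote x/a - y/b = 0
    delta = abs(b * x1 - a * y1) / math.hypot(a, b)    # distance to the line bx - ay = 0
    bound = a * a * b / (x1 * math.hypot(a, b))        # estimate from the proof
    return y1, y2, delta, bound

rows = [asymptote_gap(3.0, 4.0, x1) for x1 in (5.0, 50.0, 500.0, 5000.0)]
for y1, y2, delta, bound in rows:
    print(f"y1={y1:.6f}  y2={y2:.6f}  delta={delta:.6f}  bound={bound:.6f}")
```

In every row $y_1 < y_2$ and δ stays below the estimate, and δ shrinks as $x_1$ grows.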


If the above development is repeated, interchanging the roles of x and 
y, we would obtain an equation of the form 


$$\frac{x^2}{b^2} - \frac{y^2}{a^2} = -1. \tag{6-35}$$

The hyperbola determined by this equation is of the type shown in 
Fig. 6-19. The eccentricity and distances from the center to the focus and 
directrix are determined by the same formulas, (6-32), for this hyperbola. 
The principal axis in this case is the y-axis and the conjugate axis is the 
x-axis. 


Figure 6-19 





Students sometimes find it difficult to remember which type of 
hyperbola goes with which of the two types of equations. Rather than 
attempting to memorize this information, it is probably easier to proceed 
as follows when occasion arises. Suppose an equation is given, say 
$x^2/3^2 - y^2/4^2 = 1$. We know that the hyperbola crosses its principal 
axis at the two vertices, and has no points in common with its conjugate 
axis. Setting x = 0 in the given equation, we see that there are no values 
of y which can satisfy the equation, and hence x = 0 must be the conjugate 
axis for this hyperbola. The vertices can be found by setting y = 0 
and solving for x. The vertices and the foci are on the principal axis, which 
is the x-axis (y = 0) in this case. The hyperbola is thus of the type shown 
in Fig. 6-18. 

For this hyperbola then, a = 3 and b = 4. Therefore, 
$c = [a^2 + b^2]^{1/2} = 5$, e = c/a = 5/3, and d = a/e = 9/5. We can thus 
list for this hyperbola the following properties. 


center: (0, 0) 

eccentricity: e = 5/3 

principal axis: y = 0 

conjugate axis: x = 0 

vertices: (3, 0), (—3, 0) 

foci: (5, 0), (—5, 0) 

directrices: x = 9/5, x = —9/5 

asymptotes: $\frac{x}{3} - \frac{y}{4} = 0$, $\frac{x}{3} + \frac{y}{4} = 0$ 

As another illustrative example, let us find the equation of the hyperbola 
whose center is at the origin, which has one focus at (0, 10), and which 
has the line 2x − y = 0 as one of its asymptotes. Here the hyperbola 
must be one whose equation is of the form (6-35). An asymptote to the 
hyperbola of this form is x/b − y/a = 0. This has the slope a/b, while 
the given asymptote has slope 2. Hence we must have 


a? + b? = 100, 
a/b = 2. 


From these equations we find a = 2b and hence 


$$5b^2 = 100,$$
$$b^2 = 20,$$
$$a^2 = 80.$$

Thus the desired equation is 

$$\frac{x^2}{20} - \frac{y^2}{80} = -1.$$
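
Both hyperbola examples can be reproduced with a short Python sketch that applies relations (6-32). The function name `hyperbola_properties`, its `principal_axis` argument, and the returned dictionary keys are hypothetical names introduced for this illustration.

```python
import math

def hyperbola_properties(a, b, principal_axis="x"):
    """Quantities of x^2/a^2 - y^2/b^2 = 1   (principal_axis="x")
    or of            x^2/b^2 - y^2/a^2 = -1  (principal_axis="y")."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    c = math.hypot(a, b)     # a^2 + b^2 = c^2
    e = c / a                # c = ae
    d = a / e                # d = a/e
    if principal_axis == "x":
        vertices = [(a, 0.0), (-a, 0.0)]
        foci = [(c, 0.0), (-c, 0.0)]
        slope = b / a        # asymptotes y = +(b/a)x and y = -(b/a)x
    elif principal_axis == "y":
        vertices = [(0.0, a), (0.0, -a)]
        foci = [(0.0, c), (0.0, -c)]
        slope = a / b        # asymptotes y = +(a/b)x and y = -(a/b)x
    else:
        raise ValueError("principal_axis must be 'x' or 'y'")
    return {"c": c, "e": e, "d": d, "vertices": vertices,
            "foci": foci, "asymptote_slope": slope}

first = hyperbola_properties(3, 4)                                   # x^2/9 - y^2/16 = 1
second = hyperbola_properties(math.sqrt(80), math.sqrt(20), "y")     # x^2/20 - y^2/80 = -1
print(first)
print(second)
```

The second call recovers, up to rounding, the focus (0, 10) and the asymptote slope 2 that were given as data in the example.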


Note well that, unlike the case of the ellipse, a need not be larger than 
b. Any pair of positive values is possible for the principal and conjugate 
dimensions of a hyperbola. 


PROBLEMS 


1. Show that the same equation for the hyperbola results if the focus is assumed to 
be (—ae, 0) and the directrix x = —a/e. 


2. For the hyperbola defined by the given equation, give the principal and 
conjugate dimensions, the coordinates of the vertices and the foci, the equa- 
tions of the directrices and asymptotes, and the eccentricity. Make a sketch 
for each, showing all relevant items. 


A š y? 
(a) 5 367! (b) i oR = 
ne y? y? Pd 
ag. (d i6 16 7 
x” y? 2 2 
(e) -5 = 1+ ig (f) 2 — 4y“ = —1 


(g) $x^2 - 9y^2 = 36$ 


3. Find the equation of the hyperbola with center at the origin satisfying the 
given conditions. 


(a) One vertex at (5, 0) and eccentricity 2 

(b) One vertex at (0, 8) and one focus at (0, 18) 

(c) One focus at (4, 0) and eccentricity 5/2 

(d) One focus at (6, 0) and one directrix being x = 2 

(e) One vertex at (0, 8) and one asymptote being 2x — y = 0 

(f) One directrix being x = 4 and one asymptote being x + 4y = 0 


4. Suppose H is the hyperbola determined by the equation 


$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$

What is the equation of the hyperbola H’ whose principal axis and principal 
dimension are the conjugate axis and conjugate dimension of H and whose 
conjugate axis and conjugate dimension are the principal axis and principal 
dimension of H? What is the relationship between the asymptotes of H 
and H’? These two are called conjugate hyperbolas. 


5. During the second world war sound ranging units were used to locate enemy 
artillery. These units operated as follows. Three sound detectors were 
placed on a straight line which was nearly orthogonal to the direction of the 
gun to be located. When the sound of the gun was picked up, the time 
difference between the arrival of the sound at the right-hand and center 
detectors and the time difference between the arrival of the sound at the 
left-hand and center detectors were determined. From a table, each time 
difference was used to determine an angle. On a map lines were drawn with 
the determined angles through the points half way between the corresponding 
detectors (see Fig. 6-20). The enemy gun was then located as being at the 
point at which these lines intersected. 


Figure 6-20    Figure 6-21 


In practice, the sound ranging units observed that when the enemy gun 
was relatively close, the position located in this manner was usually slightly 
beyond the actual position. 

Explain why the gun is located on the intersection of two hyperbolas. 
For a given pair of detectors and a given time difference at these detectors, 
how could you determine the actual hyperbola? The lines read from the 
tables were the asymptotes of these hyperbolas. Why were these lines used? 
Explain the observed discrepancy.* 


6. Prove that a point satisfies an equation of the form (6-26) if and only if 
it satisfies one of the form (6-30). Given a and e > 1, show that the equation 
of the intersection of the xy-plane with the cone having vertex 
$A = (0, 0, -a\sqrt{e^2 - 1})$, axis in the direction B = [1, 0, 0], and 
λ = 1/e is (6-30). Draw a diagram similar to that in the last section to show 
that the focal properties are equivalent to the definition of the hyperbola. 


7. Prove that the following geometric construction locates the focus and the 
directrix of a hyperbola. Let O be the origin, C a corner of the rectangle whose 
sides are 2a and 2b and which is centered at the origin as shown in Fig. 6-21, 
and A the point at which one side of the rectangle crosses the x-axis. (A 
is thus a vertex of the hyperbola.) With center at O draw a circle with 
radius |OC|. This circle cuts the x-axis at F, a focus of the hyperbola. Also 
with center at O, draw a circle with radius |OA|. This circle cuts the asymptote 
(the line through O and C) at a point D which is on the directrix. 


8. Find a geometric construction similar to that in Problem 7 to locate the 
directrix of an ellipse. 


9. To what extent do any two of the five quantities a, b, c, d, and e for a given 
hyperbola determine the other three? Prepare a table for all possible 
determinations. 


10. Let θ be the angle between the principal axis and one of the asymptotes of 
a hyperbola whose eccentricity is e. Prove that e = sec θ. 


* A friend of the author was in such a unit during the second world war, 
and, despite the fact that he became a mathematician, it was not until many 
years later that he realized that his activities had anything to do with the focal 
properties of the hyperbola. 


6-5 THE PARABOLA 


While we had two different conditions which could be used to determine 
an ellipse or a hyperbola, Theorem 6-4 gives us only one condition which 
can be used to derive the general equation of a parabola. This condition 
says that all points on the parabola are at the same distance from a given 
point, the focus, and a given line, the directrix. 

To derive the equation of a parabola, let us start by assuming that the 
focus F is at the point (p, 0), and that the directrix is the line x = —p. 
Then if P is a point of the parabola with coordinates (x, y), we must have 

$$|PE| = |PF|, \tag{6-36}$$

where |PE| is the distance from P to the line, and hence 

$$|x + p| = [(x - p)^2 + y^2]^{1/2}.$$

This equation can be squared to give the equivalent equation 

$$x^2 + 2px + p^2 = x^2 - 2px + p^2 + y^2,$$

which simplifies to 

$$y^2 = 4px. \tag{6-37}$$

This is then the equation of the desired parabola. The set of points which 
satisfy this equation has the general appearance shown in Fig. 6-22. This 
figure is based on the case p > 0. The x-axis, which is the line passing 
through the focus orthogonal to the directrix, is the axis of the parabola. 
The parabola has no second axis so that we do not have to speak of this 
as the principal axis, although that is actually what it is. 

The point at which the axis cuts the parabola, the origin in this case, is 
called the vertex of the parabola. The eccentricity is, of course, 1 as is 
necessary to fit in with the focus-directrix form of the definition of the 
ellipse and hyperbola. 





Figure 6-22    Figure 6-23 

Definition 6-6. Let F be the focus and L’ the directrix of a parabola P. 
Then 


(1) the axis of the parabola is the line L through F orthogonal to L’; 
(2) the vertex of the parabola is the point of intersection of L with P. 


A fairly good sketch of the parabola can be made, as shown in Fig. 6-23, 
by sketching in two rectangles. The smaller rectangle is made up of the 
lines x = 0, x = p, y = 2p, and y = −2p. The larger rectangle, which 
is actually two squares, is made up of the lines x = 0, x = 4p, y = 4p, 
and y = −4p. The parabola then can be sketched in as shown. Essentially, 
we use here the fact that if p is the distance from the vertex to the 
focus, the parabola is of “height” 2p at the focus and of “height” 4p at the 
distance 4p from the vertex. 

We observe from Eq. (6-37) that the parabola is symmetric with 
respect to the x-axis, that is, with respect to its axis. This is the only axis 
of symmetry for the parabola. 

When p < 0 in Eq. (6-37), the parabola will “open out” to the left 
as in Fig. 6-24 (b). The axis is still the x-axis, however. 


If we interchange the roles of x and y, we get an equation of the form 

$$x^2 = 4py. \tag{6-38}$$

The appearance of the parabola in this case is as shown in Fig. 6-24(c) 
and (d). 

For any given equation it is easy to determine which type of figure is 
involved. If x, for example, is the squared variable, as in (6-38), and 
if the point (x, y) is on the parabola, then so is (−x, y). This means 
that the y-axis must be the axis of symmetry. Since $x^2$ is never negative, 
4py cannot be negative either. Hence every point (x, y) on the parabola, 
other than the vertex, must 
be such that y is of the same sign as p. In this way, any given equation 
of this type can be analyzed. Conversely, the type of equation required 
for a given parabola is easily determined. 


Figure 6-24. (a) $y^2 = 4px$, p > 0; (b) $y^2 = 4px$, p < 0; (c) $x^2 = 4py$, p > 0; (d) $x^2 = 4py$, p < 0. 

For example, 

$$x^2 + 8y = 0$$

is the equation of a parabola. Rewriting this as 

$$x^2 = 4(-2)y,$$

we recognize it as a parabola of the type shown in Fig. 6-24(d). Its 
focus is at (0, −2) and its directrix is the line y = 2. 

Similarly, the equation of the parabola with vertex at (0, 0) and focus 
(—3, 0) must be 


$$y^2 = 4(-3)x = -12x.$$

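
Reading off p from an equation of the form (6-37) or (6-38) is mechanical, as the following Python sketch suggests. It is an illustration only; the function name `parabola_properties` and its argument convention (which variable is squared, and the coefficient of the other variable) are assumptions made here.

```python
def parabola_properties(squared, coeff):
    """Focus, directrix and axis of  y^2 = coeff * x  (squared="y")
    or of                           x^2 = coeff * y  (squared="x")."""
    if coeff == 0:
        raise ValueError("coeff must be nonzero")
    p = coeff / 4.0                      # compare with y^2 = 4px or x^2 = 4py
    if squared == "y":
        return {"p": p, "focus": (p, 0.0), "directrix": ("x", -p), "axis": "y = 0"}
    if squared == "x":
        return {"p": p, "focus": (0.0, p), "directrix": ("y", -p), "axis": "x = 0"}
    raise ValueError("squared must be 'x' or 'y'")

first = parabola_properties("x", -8)     # x^2 + 8y = 0, i.e. x^2 = -8y
second = parabola_properties("y", -12)   # y^2 = -12x
print(first)
print(second)
```

The entry `("y", 2.0)` for the first parabola stands for the directrix y = 2, and the second call returns the focus (−3, 0) used in the last example.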
PROBLEMS 


1. For each of the following parabolas, give the coordinates of the focus and the 
equations of the axis and the directrix. Sketch the parabola. 


(a) $x^2 = 16y$ (b) $y^2 = -6x$ 
(c) $x^2 + 2y = 0$ (d) $x - 12y^2 = 0$ 
(e) $4x^2 + 5y = 0$ (f) $2x + 15y^2 = 0$ 
(g) $4x^2 = y$ (h) $3x^2 - 8y = 0$ 


2. Find the equation of each of the parabolas with vertex at (0, 0) and satisfying 
the following conditions. 


(a) focus at (2, 0) (b) focus at (—3, 0) 
(c) focus at (4, 0) (d) focus at (0, —2) 
(e) focus at (0, —4) (f) focus at (0, 5) 

(g) directrix x = −4 (h) directrix x = 7 
(i) directrix y = (j) directrix y = —2 


3. Show that the cone with 
$\lambda = 1/\sqrt{2}$, $B = [1/\sqrt{2}, 0, 1/\sqrt{2}]$, and A = (0, 0, −2p) 
intersects the plane z = 0 in the parabola with equation 
$y^2 = 4px$. 


4. (a) Using directly the definition of the parabola as given by (6-36), find the 
equation of the parabola with focus at (4, 3) and directrix 4x + 3y + 25 = 0. 
(b) Find the equation of the parabola with vertex at (0, 0) and focus at 
(1, −1). 
(c) Find the equation of the parabola with directrix x + y = 0 and vertex 
at (0, 2). 


6-6 GENERAL QUADRATIC EQUATIONS WITHOUT 
CROSS PRODUCT TERMS 


If we make a translation of all of the points of the plane through a 
distance h in the x-direction and a distance k in the y-direction, then the 
origin is translated to the point (h, k) and a point which has coordinates 
(x’, y’) is translated to the point with coordinates (x’ + h, y’ + k). 
Suppose that (x, y) is an arbitrary point of the translated plane, and that 
(x’, y’) is the point which is translated into (x, y). Then x = x’ + h and 
y = y’ + k, or 

$$x' = x - h, \qquad y' = y - k. \tag{6-39}$$


Now suppose we have a point set which we wish to translate. As a 
specific example, suppose we are considering the points of an ellipse 
(see Fig. 6-25). That is, we have a set of points (x’, y’) which satisfy the 
equation 

$$\frac{x'^2}{a^2} + \frac{y'^2}{b^2} = 1.$$

After the translation the coordinates (x, y) of the points of the translated 
ellipse must satisfy the equation 

$$\frac{(x - h)^2}{a^2} + \frac{(y - k)^2}{b^2} = 1. \tag{6-40}$$


In other words, (6-40) is the equation of the ellipse after translation. 


Figure 6-25 


It is sometimes useful to think of a translation in terms of the change 
of position of the coordinate axes. Thus we may first think of a translation 
as being a function which maps the points of the x’y’-plane to the points 
of the xy-plane, which we think of as an entirely separate plane. The 
resulting loci in their respective planes are identical except for their 
location with respect to the coordinate axes. In Fig. 6-26 we have illustrated 
this point of view. Note, however, that we can reconstruct the 
left-hand side of this figure if we view the dashed lines on the right-hand 
side as the x’- and y’-axes. This fact may be clarified, perhaps, by looking 
at some specific examples. 

Figure 6-26 

The first type of problem we investigate is the problem of finding the 
equation of a locus in a nonstandard position. For example, what is the 
equation of the parabola with focus (1, 6) and vertex (—3, 6)? Plotting 
these points, as in Fig. 6-27, we see that the axis of the parabola must be 
the line y = 6, which is shown as a broken line. Adding the line x = −3, 
and thinking of these two lines as the x’- and y’-axes, we see that the 
desired parabola would have an equation of the form 

$$y'^2 = 4 \cdot 4x'$$

in the x’y’-plane. However, the x-, y- and x’-, y’-axes are related by the 
equations 

$$x' = x + 3,$$
$$y' = y - 6.$$

Recall that these equations are most easily determined by observing that 
the point x’ = 0, y’ = 0 must be the same as x = −3, y = 6. Inserting 
these values into the equation of the parabola, we find the desired equation 
to be 

$$(y - 6)^2 = 16(x + 3).$$
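
As a numerical spot check of this result, the sketch below takes several points of $(y - 6)^2 = 16(x + 3)$ and compares their distances to the focus (1, 6) and to the directrix. The directrix x = −7 is not stated in the text; it is derived here from the vertex x = −3 and p = 4, and the function name is hypothetical.

```python
import math

def on_translated_parabola(y):
    """Return the point (x, y) of (y - 6)^2 = 16 (x + 3) with the given y."""
    return (y - 6.0) ** 2 / 16.0 - 3.0, y

focus = (1.0, 6.0)
directrix_x = -7.0     # vertex x = -3 minus p = 4 (derived, not quoted from the text)

gaps = []
for y in (-10.0, 0.0, 6.0, 14.0, 30.0):
    x, _ = on_translated_parabola(y)
    dist_focus = math.hypot(x - focus[0], y - focus[1])
    dist_directrix = abs(x - directrix_x)
    gaps.append(abs(dist_focus - dist_directrix))

# Each gap should vanish up to rounding, as required by (6-36).
print(gaps)
```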


Another type of problem which often occurs is that of identifying the 
locus of a given equation. Again it is often easiest to simplify the equation 
by a change of the variables such as given by Eqs. (6-39), and view the 
new equation as the equation of the same locus in terms of a new coordinate 
system. Thus, for example, if we are given an equation such as 
(6-40), we can introduce new coordinates x’ and y’, satisfying (6-39) so 
as to simplify the equation. From (6-39) we see that the new axes, x’ = 0 
and y’ = 0, are the lines x = h and y = k. The quantities, points, and 
lines associated with the conic in the new coordinate system can then be 
identified and located in terms of the original coordinate system. 

The ellipse whose equation is (6-40) has a focal distance 
$c = [a^2 - b^2]^{1/2}$, eccentricity e = c/a, and directrix distance d = a/e, 
but its center is at the point (h, k). Its principal axis is the x’-axis, or, 
in other words, y = k. Its conjugate axis is the line x = h. Note that the 
center is the point at which the left-hand side of (6-40) vanishes, and the 
axes are the lines parallel to the coordinate axes through this center. The 
quantity c is the distance between the center and the foci, and hence the foci 
of the ellipse determined by (6-40) are the points (h + c, k) and (h − c, k). 
Similarly, we see that the vertices are the points (h + a, k) and (h − a, k). 
The directrices will be the lines x = h + d and x = h − d. 
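
The bookkeeping in the last paragraph can be collected in a short Python sketch. The function name `translated_ellipse`, the dictionary keys, and the sample values a = 5, b = 4, h = 2, k = −1 are all assumptions made for illustration; they do not come from the text.

```python
import math

def translated_ellipse(a, b, h, k):
    """Features of (x - h)^2/a^2 + (y - k)^2/b^2 = 1, assuming a > b > 0."""
    if not a > b > 0:
        raise ValueError("this sketch assumes a > b > 0")
    c = math.sqrt(a * a - b * b)     # focal distance
    e = c / a                        # eccentricity
    d = a / e                        # directrix distance
    return {
        "center": (h, k),
        "principal_axis_y": k,                   # the line y = k
        "conjugate_axis_x": h,                   # the line x = h
        "foci": [(h + c, k), (h - c, k)],
        "vertices": [(h + a, k), (h - a, k)],
        "directrices_x": [h + d, h - d],         # the lines x = h + d and x = h - d
        "e": e,
    }

features = translated_ellipse(5.0, 4.0, 2.0, -1.0)
print(features)
```

With these sample values the foci come out at (5, −1) and (−1, −1), three units on either side of the center (2, −1).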





FIGURE 6-27 FIGURE 6-28 


Exactly similar reasoning will apply to the translation of any of the 
conic sections discussed in the previous sections. An equation such as 


$$\frac{(x - 3)^2}{4} - \frac{(y + 2)^2}{9} = 1$$

is easily identified as the equation of a hyperbola with center at (3, −2). 
Its asymptotes are the lines 

$$\frac{x - 3}{2} \pm \frac{y + 2}{3} = 0,$$

as may be discovered directly from the equation, thinking of the new 
coordinate system as defined by Eqs. (6-39), or equivalently, found 
with the help of a sketch as in Fig. 6-28. It is seen that one asymptote is 
the line passing through (3, −2) and (5, 1) while the other asymptote 
is the line through (3, −2) and (5, −5). The other quantities connected 
with this hyperbola can be determined in a manner similar to that discussed 
above. The methods are best learned by doing a few examples, 
such as the problems at the end of this section. 


Now, we wish to consider the general quadratic equation 
$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0, \tag{6-41}$$


and to discuss the possible point sets which can satisfy such an equation. 
In this section, we will restrict ourselves to a discussion of equations of the 
type (6-41) in which B = 0, that is, to general quadratic equations with- 
out cross product terms. 

Suppose that in such an equation, 


$$Ax^2 + Cy^2 + Dx + Ey + F = 0, \tag{6-42}$$


both A and C are zero (strictly speaking, we do not then have a quadratic 
equation). The equation is then seen to be the equation of a straight line. 
A straight line is actually a degenerate case of a conic section (see Problem 
2 at the end of this section), but we are more interested in the cases in 
which (6-42) is a true quadratic equation, where A and C are not both 
zero. 

First, let us consider the case in which A ≠ 0 and C = E = 0. Equation 
(6-42) is then of the form 

$$Ax^2 + Dx + F = 0,$$


a simple quadratic equation in x. If this quadratic equation has no roots, 
then there are no points satisfying the equation. If it has a single root r, 
then the line x = r is the set of all points satisfying the given equation. 
If there are two distinct roots, then two parallel lines constitute the set of 
points satisfying Eq. (6-42). This last locus is an especially degenerate 
case since it cannot be obtained by the intersection of a cone and plane 
in any way, but requires the cone to degenerate into a cylinder. 

The discussion of the case when (6-42) reduces to a quadratic equation 
in y alone is similar. 

Now suppose A ≠ 0, C = 0, and E ≠ 0. Then, by completing the 
square, the Dx term can be combined with the $Ax^2$, and the Ey term can 
be combined with the resulting constant term to give an equation of the 
form 

$$A(x - h)^2 + E(y - k) = 0.$$

This is easily recognized as the equation of a translated parabola. 

A similar discussion shows that when A = 0, C ≠ 0, and D ≠ 0, 
Eq. (6-42) again determines a parabola. 

We are left then with only the cases in which A ≠ 0 and C ≠ 0. By 
completing the square, Eq. (6-42) then can always be reduced to an 
equation of the form 

$$A(x - h)^2 + C(y - k)^2 = m, \tag{6-43}$$

where m is some constant. Again, we may consider various cases. 


First, if A = C in (6-43), then this is the equation of a circle when 
m ≠ 0 and is of the same sign as A and C. If m = 0, only the single 
point (h, k) satisfies the equation, and if m ≠ 0 but is of opposite sign to 
A and C, then there are no points which satisfy (6-43). 


Secondly, if A ≠ C, but both are of the same sign, then (6-43) determines 
a translated ellipse, a single point, or has no locus at all depending 
on whether m is of the same sign as A and C, zero, or of opposite sign. 


Finally, if A and C are of opposite signs, then (6-43) determines a 
translated hyperbola when m ≠ 0. If m = 0, however, the set of points 
satisfying (6-43) is a pair of intersecting straight lines, which are the 
asymptotes of the hyperbolas determined when m ≠ 0. 


Theorem 6-10. The locus of Eq. (6—42) is one of the following: 


(1) the empty set 

(2) a single point 

(3) a single line, parallel to one of the coordinate axes 

(4) two parallel lines, parallel to one of the coordinate axes 

(5) two intersecting lines whose slopes are the negatives of one another 

(6) a circle 

(7) a nondegenerate conic section whose principal axis is parallel to 
one of the coordinate axes.


This theorem has been proved, except for a few details, in the discussion 
given above. These remaining details are left as exercises. 
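
The case analysis behind Theorem 6-10 can be sketched as a short Python function. This is only an illustration under our own assumptions: the function name and the strings it returns are hypothetical, the coefficients are converted to exact fractions so that the comparisons with zero are reliable, and case (7) is split into its parabola, ellipse, and hyperbola subcases.

```python
from fractions import Fraction


def classify_no_cross_term(A, C, D, E, F):
    """Name the locus of A*x^2 + C*y^2 + D*x + E*y + F = 0 (A, C not both 0)."""
    A, C, D, E, F = (Fraction(v) for v in (A, C, D, E, F))
    if A == 0 and C == 0:
        raise ValueError("A and C must not both be zero")
    if A == 0:
        # Interchange x and y so that the squared term present is x^2.
        A, C, D, E = C, A, E, D
    if C == 0:
        if E != 0:
            return "parabola"
        # Quadratic in one variable alone: count its real roots.
        disc = D * D - 4 * A * F
        if disc < 0:
            return "empty set"
        return "single line" if disc == 0 else "two parallel lines"
    # Complete both squares: A(x - h)^2 + C(y - k)^2 = m.
    m = D * D / (4 * A) + E * E / (4 * C) - F
    if A * C > 0:
        if m == 0:
            return "single point"
        if m * A < 0:
            return "empty set"
        return "circle" if A == C else "ellipse"
    return "two intersecting lines" if m == 0 else "hyperbola"


print(classify_no_cross_term(1, -1, 0, 0, 0))  # two intersecting lines
```

The branches follow the order of the discussion above: one squared term first, then both squared terms with like signs, then opposite signs.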

The methods used in the above discussion can be applied in practice to 
actually determine the set of points satisfying any equation of the form
(6-42). The trick is merely to complete the square on each variable whose 
squared term appears. 

For example, what is the locus of


\[ x^2 + 4y^2 + 2x - 24y + 33 = 0? \]

To answer this question, we proceed as follows:


\[
\begin{aligned}
x^2 + 2x + 4(y^2 - 6y) &= -33, \\
x^2 + 2x + 1 + 4(y^2 - 6y + 9) &= -33 + 1 + 36, \\
(x + 1)^2 + 4(y - 3)^2 &= 4, \\
\frac{(x + 1)^2}{4} + \frac{(y - 3)^2}{1} &= 1.
\end{aligned}
\]


This we recognize as an ellipse with center at $(-1, 3)$ and principal axis
parallel to the $x$-axis. Here, $a = 2$, $b = 1$, $c = \sqrt{3}$,
$e = \sqrt{3}/2$, and $d = 4/\sqrt{3} = 4\sqrt{3}/3$. Thus for this ellipse,
we have the following:


center: $(-1, 3)$

eccentricity: $e = \sqrt{3}/2$

principal axis: $y = 3$

conjugate axis: $x = -1$

foci: $(-1 + \sqrt{3}, 3)$, $(-1 - \sqrt{3}, 3)$

principal vertices: $(1, 3)$, $(-3, 3)$

conjugate vertices: $(-1, 4)$, $(-1, 2)$

directrices: $x = -1 + 4\sqrt{3}/3$, $x = -1 - 4\sqrt{3}/3$


We remark that it is usually helpful to make a sketch as an aid in deter- 
mining these quantities. 
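
As a check on quantities of this kind, the following Python sketch computes them for an ellipse already written in the form $(x - h)^2/a^2 + (y - k)^2/b^2 = 1$ with $a > b > 0$, so that the principal axis is parallel to the $x$-axis. The function name and the dictionary keys are our own choices, not notation from the text.

```python
import math


def ellipse_data(h, k, a, b):
    """Data for (x-h)^2/a^2 + (y-k)^2/b^2 = 1 with a > b > 0."""
    c = math.sqrt(a * a - b * b)  # distance from center to each focus
    e = c / a                     # eccentricity
    d = a / e                     # distance from center to each directrix
    return {
        "center": (h, k),
        "eccentricity": e,
        "foci": [(h + c, k), (h - c, k)],
        "principal_vertices": [(h + a, k), (h - a, k)],
        "conjugate_vertices": [(h, k + b), (h, k - b)],
        "directrices_x": [h + d, h - d],
    }


# The ellipse of the example above: center (-1, 3), a = 2, b = 1.
for name, value in ellipse_data(-1, 3, 2, 1).items():
    print(name, value)
```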


PROBLEMS 


1. Sketch each of the following conic sections. Identify each; give the coordinates 
of the center, foci, and vertices, and the equations of the axes, directrices, and 
asymptotes (if any). What is the eccentricity of each? 


(z — 1) (y — 8) 


ae ee al 
(b) $(x - 3)^2 = 16(y + 2)$
(c) (x + 1) 4 -9 =1 


(d) $x^2 - 2x + 8y + 41 = 0$

(e) $16x^2 + 25y^2 - 96x + 50y - 231 = 0$

(f) $x^2 - 4y^2 - 14x + 45 = 0$

(g) $y^2 - 6x - 4y + 7 = 0$

(h) $x^2 - 4y^2 + 6x - 4y + 4 = 0$

(i) $9x^2 + 9y^2 - 54x + 6y + 66 = 0$

(j) $4x^2 - 25y^2 + 40x + 50y + 75 = 0$


2. Describe how the intersection of a cone and a plane can be: (a) a single 
straight line, (b) two intersecting lines, (c) a single point. 




3. Find the equation of the conic section determined in each of the following 
cases: 
(a) The ellipse with foci at (3, —2) and (3, 6), and with principal dimension 5 
(b) The hyperbola of eccentricity 2 with a focus at (4, 1) and directrix 

(associated with this focus) x = 1 

(c) The parabola with focus at (7, 6) and directrix y = 4 
(d) The ellipse with center at (3, 5), a focus at (3, 8) and a vertex at (3, 9) 
(e) The hyperbola with asymptotes

\[ 4x + y - 11 = 0, \qquad 4x - y - 13 = 0, \]

and a vertex at (3, 1)


4. Find the equation of the conic section having eccentricity e, a focus at (1, 0) 
and an associated vertex at (0, 0). What happens to this equation and 
its locus as e approaches 0? as e approaches 1 from below? from above? 


5. What are the possible loci of (6—42) if 
(a) AC > 0? 
(b) AC = 0? 
(c) AC < 0? 
6. Show that the lines of parts (3) and (4) of Theorem 6-10 must be parallel to 
one of the coordinate axes. 


7. Prove that when case (5) occurs in Theorem 6-10, the two lines have slopes 
which are negatives of one another. 


8. Give a specific example illustrating each of the first five cases of Theorem 6-10. 




Quadratic Curves 
and Surfaces 


7-1 ROTATION OF AXES 


In the last section we considered the effect of translations upon loci in 
the plane. We now wish to consider the effect of another euclidean motion, 
rotation. 

We will consider only the case of rotations about the origin. Rotations 
about any other center can be studied by translating the center of rotation 
to the origin, performing the rotation, and translating the center back 
to the original position. 

As before, we can think of this euclidean motion as a function which 
maps the points of the $xy$-plane onto the points of a (separate) $x'y'$-plane.
However, the relative positions of points are unchanged, and this mapping 
can be illustrated by drawing two sets of coordinate axes in the same 
plane. A given point will have different coordinates with respect to the 
different axes, as in Fig. 7-1. 

Theorem 5-13 (on page 160) tells us how to determine the coordinates
of the point in the new coordinate system. Let $e_1'$ and $e_2'$ be the unit
vectors in the directions of the new $x'$- and $y'$-axes. Then from Theorem
5-13, we have for any point $X$


\[
\begin{aligned}
X &= (X \cdot e_1')e_1' + (X \cdot e_2')e_2' \\
&= x'e_1' + y'e_2'.
\end{aligned}
\]


Hence the point has coordinates $(x', y')$ with respect to the new coordinate
system, where
\[
\begin{aligned}
x' &= X \cdot e_1', \\
y' &= X \cdot e_2'.
\end{aligned}
\tag{7-1}
\]


To find the new coordinates all we have to do is compute these dot
products. But we must first know $e_1'$ and $e_2'$. Let us suppose that the
new coordinate system is located so that the $x'$-axis makes a signed angle
of $\theta$ with the original $x$-axis, as in Fig. 7-1. Then the signed angle
from the $y$-axis to the $x'$-axis is $\theta - \pi/2$ (why?), and the signed
angles from the $x$- and $y$-axes to the $y'$-axis are $\theta + \pi/2$ and
$\theta$ respectively. Therefore,


\[
\begin{aligned}
e_1' \cdot e_1 &= \cos\theta, \\
e_1' \cdot e_2 &= \cos(\theta - \pi/2) = \sin\theta, \\
e_2' \cdot e_1 &= \cos(\theta + \pi/2) = -\sin\theta, \\
e_2' \cdot e_2 &= \cos\theta.
\end{aligned}
\]


Using these relations and Theorem 5-13 again
[$e_1' = (e_1' \cdot e_1)e_1 + (e_1' \cdot e_2)e_2$, etc.], we have

\[
\begin{aligned}
e_1' &= \cos\theta\, e_1 + \sin\theta\, e_2, \\
e_2' &= -\sin\theta\, e_1 + \cos\theta\, e_2.
\end{aligned}
\tag{7-2}
\]


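For a given point $X = (x, y)$ in the original coordinate system, we can
use relations (7-2) to obtain the change of coordinates in (7-1). Doing
so, we find
\[
\begin{aligned}
X \cdot e_1' &= (xe_1 + ye_2) \cdot (\cos\theta\, e_1 + \sin\theta\, e_2) \\
&= x\cos\theta + y\sin\theta,
\end{aligned}
\]
and
\[
\begin{aligned}
X \cdot e_2' &= (xe_1 + ye_2) \cdot (-\sin\theta\, e_1 + \cos\theta\, e_2) \\
&= -x\sin\theta + y\cos\theta.
\end{aligned}
\]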


We have thus proved the theorem below. 


Theorem 7-1. If a point $(x, y)$ has coordinates $(x', y')$ in a new
coordinate system which has been rotated through the signed angle $\theta$,
then


\[
\begin{aligned}
x' &= x\cos\theta + y\sin\theta, \\
y' &= -x\sin\theta + y\cos\theta.
\end{aligned}
\tag{7-3}
\]
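
Relations (7-3) translate directly into code. The sketch below is a minimal Python version; the function name is our own, and the angle is assumed to be given in radians.

```python
import math


def rotate_coordinates(x, y, theta):
    """Coordinates (x', y') of the point (x, y) after the axes are rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (x * c + y * s, -x * s + y * c)


# The point (1, 1) lies on the new x'-axis when the axes are turned 45 degrees,
# so its new y'-coordinate should vanish up to rounding error.
xp, yp = rotate_coordinates(1.0, 1.0, math.pi / 4)
print(xp, yp)
```

Note that the point itself does not move; only its coordinates change, which is why the signs differ from those in a formula that rotates the point.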


FIGURE 7-1    FIGURE 7-2


Let us illustrate the use of these formulas by finding the equation of the
hyperbola whose principal axis is the line $y = x$, whose asymptotes are
the lines $x = 0$ and $y = 0$, and which has a vertex at the point $(1, 1)$
(Fig. 7-2). Introduce a new coordinate system whose $x'$-axis is the principal
axis of this hyperbola. Then, from relations (7-3) we have


\[
\begin{aligned}
x' &= \frac{1}{\sqrt{2}}(x + y), \\
y' &= \frac{1}{\sqrt{2}}(-x + y).
\end{aligned}
\tag{7-4}
\]

The line $x = 0$ is the line $x' - y' = 0$ in these new coordinates (why?),
and the other asymptote is the line $x' + y' = 0$. These can be obtained
by solving the relations (7-4) for $x$ and $y$ in terms of $x'$ and $y'$. The
vertex is at the point $(\sqrt{2}, 0)$ in terms of these coordinates.
The equation of the required hyperbola in the
$(x', y')$-coordinate system is therefore


\[ \frac{x'^2}{2} - \frac{y'^2}{2} = 1. \]


From (7-4) this is
\[ \tfrac{1}{4}(x + y)^2 - \tfrac{1}{4}(-x + y)^2 = 1, \]
or after some simplification,


\[ xy = 1. \tag{7-5} \]
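
The result (7-5) can be checked numerically. The Python sketch below assumes that the hyperbola in the rotated system is $x'^2/2 - y'^2/2 = 1$, which is what the vertex $(\sqrt{2}, 0)$ and the asymptotes $x' \pm y' = 0$ require, and it uses the inverse of relations (7-3) to return to the original coordinates. The function names and the parametrization by hyperbolic functions are our own.

```python
import math


def original_coordinates(xp, yp, theta):
    """Invert the rotation (7-3): recover (x, y) from (x', y')."""
    c, s = math.cos(theta), math.sin(theta)
    return (xp * c - yp * s, xp * s + yp * c)


def hyperbola_point(t):
    """A point (x', y') on the branch x' > 0 of x'^2/2 - y'^2/2 = 1."""
    return (math.sqrt(2) * math.cosh(t), math.sqrt(2) * math.sinh(t))


products = []
for t in (-1.0, 0.0, 0.5, 2.0):
    x, y = original_coordinates(*hyperbola_point(t), math.pi / 4)
    products.append(x * y)

print(products)  # each entry should equal 1 up to rounding error
```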


FIGURE 7-3 


In the next section we will be looking at the opposite problem: how to
find a rotation which will simplify a given equation. For this, we will
find it useful to have some relations which are consequences of the
trigonometric sum formulas. These were derived in Section 2-7, but we can
repeat the derivation here. Indeed, these formulas can be derived by using
the methods of this section. Let us suppose that the vector $e_1'$ results
from a rotation of $e_1$ through an angle $\theta$, and that $e_1''$ results
from a rotation of $e_1$ through an angle $\theta + \phi$, or equivalently,
from the rotation of $e_1'$ through an angle of $\phi$ (see Fig. 7-3). Then


\[
\begin{aligned}
e_1'' &= \cos(\theta + \phi)\, e_1 + \sin(\theta + \phi)\, e_2 \\
&= \cos\phi\, e_1' + \sin\phi\, e_2'.
\end{aligned}
\]
However, $e_1'$ and $e_2'$ are given by Eqs. (7-2), and substituting these
values into the second equation above, we have


\[
\begin{aligned}
\cos(\theta + \phi)\, e_1 + \sin(\theta + \phi)\, e_2
&= \cos\phi\,(\cos\theta\, e_1 + \sin\theta\, e_2)
 + \sin\phi\,(-\sin\theta\, e_1 + \cos\theta\, e_2) \\
&= (\cos\theta\cos\phi - \sin\theta\sin\phi)\, e_1
 + (\sin\theta\cos\phi + \cos\theta\sin\phi)\, e_2.
\end{aligned}
\]
Since the two sides of this equation are identical, the coefficients of
$e_1$ and of $e_2$ must agree, and we have proved


\[
\begin{aligned}
\cos(\theta + \phi) &= \cos\theta\cos\phi - \sin\theta\sin\phi, \\
\sin(\theta + \phi) &= \sin\theta\cos\phi + \cos\theta\sin\phi.
\end{aligned}
\tag{7-6}
\]
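
The idea of this derivation, that two successive rotations amount to a single rotation through the sum of the angles, can be tested numerically. In the Python sketch below the function name and the sample angles are arbitrary choices of ours.

```python
import math


def rotate_vector(v, angle):
    """Rotate the vector v = (v1, v2) counterclockwise through the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (v[0] * c - v[1] * s, v[0] * s + v[1] * c)


theta, phi = 0.7, 1.9
e1 = (1.0, 0.0)

# Rotate e1 through theta and then through phi ...
two_steps = rotate_vector(rotate_vector(e1, theta), phi)
# ... and compare with a single rotation through theta + phi.
one_step = rotate_vector(e1, theta + phi)

print(two_steps, one_step)  # the two pairs should agree up to rounding error
```

The components of `one_step` are the left-hand sides of (7-6), and those of `two_steps` are the right-hand sides.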


PROBLEMS 


1. Find formulas similar to (7-2) for $e_1$ and $e_2$ in terms of $e_1'$ and $e_2'$.
2. Use the results of Problem 1 to obtain formulas for $x$ and $y$ in terms of $x'$ and $y'$.


3. Find $x$ and $y$ in terms of $x'$ and $y'$ by solving Eqs. (7-3). How does this
result compare with the answer to Problem 2?


4. Find the equation of the hyperbola whose principal and conjugate dimensions 
are a and b, whose center is at the origin, having the y-axis as an asymptote, 
and such that a vertex is in the first quadrant. Can you solve the resulting 
equation for y as a function of x? 


5. Find the equation of the ellipse whose foci are at (3, 4) and (—3, —4) and 
whose principal dimension is 13. Use the methods of this section. 


6. Find the equation of the ellipse of Problem 5 directly from the focal property, 
as was done in Section 6-3. Compare with the result of Problem 5. Which 
method is easier? 


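7. Starting from Equations (7-6) prove the following:


\[
\begin{aligned}
\sin 2\theta &= 2\sin\theta\cos\theta \\
\cos 2\theta &= \cos^2\theta - \sin^2\theta \\
\cos 2\theta &= 2\cos^2\theta - 1 \\
\cos 2\theta &= 1 - 2\sin^2\theta \\
\cos^2\theta &= \tfrac{1}{2}[1 + \cos 2\theta] \\
\sin^2\theta &= \tfrac{1}{2}[1 - \cos 2\theta] \\
\tan^2\theta &= \frac{1 - \cos 2\theta}{1 + \cos 2\theta} \\
\tan\theta &= \frac{1 - \cos 2\theta}{\sin 2\theta} \\
\tan\theta &= \frac{\sin 2\theta}{1 + \cos 2\theta}
\end{aligned}
\]

8. Prove:
\[
\begin{aligned}
\tan(\theta + \phi) &= \frac{\tan\theta + \tan\phi}{1 - \tan\theta\tan\phi} \\
\tan(\theta - \phi) &= \frac{\tan\theta - \tan\phi}{1 + \tan\theta\tan\phi}
\end{aligned}
\]


7-2 GENERAL QUADRATIC EQUATIONS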


In the first section of the last chapter we found that the intersection of a
cone with the $xy$-plane is the set of points which satisfy a quadratic
equation, that is, it is the locus of an equation of the form


\[ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0. \tag{7-7} \]


In this section we wish to investigate the converse property, i.e. to determine 
what point sets will satisfy an arbitrary equation of the form (7-7). 

If B = 0 in Eq. (7-7), then we know that the equation represents a 
conic section, a degenerate conic section, or the empty set, as was dis- 
cussed in Section 6-6. We will see that the same result holds for Eq. (7-7) 
by showing that in a suitably rotated coordinate system the cross product 
term xy will not occur, and hence in this coordinate system the results of 
Section 6-6 will apply. 

Let us suppose that the $(x', y')$-coordinate system is rotated through an
angle $\theta$ with respect to the $x$-, $y$-axes. Then from (7-3) we have


\[
\begin{aligned}
x' &= x\cos\theta + y\sin\theta, \\
y' &= -x\sin\theta + y\cos\theta.
\end{aligned}
\]


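If these relations are solved for $x$ and $y$ in terms of $x'$ and $y'$ (see
Problems 2 and 3 in the last section) we find


\[
\begin{aligned}
x &= x'\cos\theta - y'\sin\theta, \\
y &= x'\sin\theta + y'\cos\theta,
\end{aligned}
\tag{7-8}
\]
and hence, we can compute
\[
\begin{aligned}
x^2 &= x'^2\cos^2\theta - 2x'y'\cos\theta\sin\theta + y'^2\sin^2\theta, \\
y^2 &= x'^2\sin^2\theta + 2x'y'\cos\theta\sin\theta + y'^2\cos^2\theta, \\
xy &= x'^2\cos\theta\sin\theta + x'y'(\cos^2\theta - \sin^2\theta) - y'^2\cos\theta\sin\theta.
\end{aligned}
\]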


Therefore, in terms of the rotated coordinate system, the point set
satisfying Eq. (7-7) will satisfy the equation
\[ A'x'^2 + B'x'y' + C'y'^2 + D'x' + E'y' + F' = 0, \]

where

\[
\begin{aligned}
A' &= A\cos^2\theta + B\cos\theta\sin\theta + C\sin^2\theta, \\
B' &= 2(C - A)\cos\theta\sin\theta + B(\cos^2\theta - \sin^2\theta), \\
C' &= A\sin^2\theta - B\cos\theta\sin\theta + C\cos^2\theta, \\
D' &= D\cos\theta + E\sin\theta, \\
E' &= -D\sin\theta + E\cos\theta, \\
F' &= F.
\end{aligned}
\tag{7-9}
\]

Using the relations of Problem 7 of the last section, we see that
\[ B' = (C - A)\sin 2\theta + B\cos 2\theta. \]


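By proper choice of $\theta$, this expression can be made zero. Indeed, if we
set

\[
\begin{aligned}
\sin 2\theta &= \frac{B}{\Delta}, \\
\cos 2\theta &= -\frac{C - A}{\Delta},
\end{aligned}
\tag{7-10}
\]
where
\[ \Delta = [B^2 + (C - A)^2]^{1/2}, \tag{7-11} \]
then $B' = 0$.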


The values on the right-hand side of Eqs. (7-10) are the sine and cosine
of some angle $2\theta$ since the sum of their squares is one. The only
possible way in which this could fail is if $\Delta = 0$. However, in order
for $\Delta$ to be zero, we would have to have $B = 0$ and $C - A = 0$. Since
we were trying to eliminate $B$ from (7-7) by this process, this would mean
that Eq. (7-7) was already in the desired form to begin with. It is
interesting to note that in this case, Eq. (7-7) represents a circle (if its
locus is nondegenerate).

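Using the identities of Problem 7 of the last section, we find from
(7-10) that


\[
\begin{aligned}
\cos^2\theta &= \frac{1}{2}\left(1 - \frac{C - A}{\Delta}\right) = \frac{1}{2\Delta}[\Delta - (C - A)], \\
\sin^2\theta &= \frac{1}{2}\left(1 + \frac{C - A}{\Delta}\right) = \frac{1}{2\Delta}[\Delta + (C - A)], \\
\cos\theta\sin\theta &= \frac{B}{2\Delta}.
\end{aligned}
\tag{7-12}
\]


Putting these values into (7-9), we find that when the angle $\theta$ is
determined by (7-10) we have


\[
\begin{aligned}
A' &= \tfrac{1}{2}[(C + A) + \Delta], \\
B' &= 0, \\
C' &= \tfrac{1}{2}[(C + A) - \Delta].
\end{aligned}
\tag{7-13}
\]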
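
Relations (7-9) and the choice of angle can be combined in a short Python sketch. We assume here that (7-10) reads $\sin 2\theta = B/\Delta$ and $\cos 2\theta = -(C - A)/\Delta$, which is the choice that makes $B'$ vanish; the function names are our own, and `math.atan2` is used so that $\theta$ falls in the first or fourth quadrant.

```python
import math


def rotation_angle(A, B, C):
    """Angle theta in (-pi/2, pi/2] with sin 2θ = B/Δ and cos 2θ = -(C - A)/Δ."""
    if B == 0 and A == C:
        return 0.0  # Δ = 0: there is no cross product term to remove
    return 0.5 * math.atan2(B, A - C)


def rotated_coefficients(A, B, C, D, E, F, theta):
    """Coefficients (A', B', C', D', E', F') given by relations (7-9)."""
    c, s = math.cos(theta), math.sin(theta)
    return (
        A * c * c + B * c * s + C * s * s,
        2 * (C - A) * c * s + B * (c * c - s * s),
        A * s * s - B * c * s + C * c * c,
        D * c + E * s,
        -D * s + E * c,
        F,
    )


theta = rotation_angle(9, -6, 17)
print(rotated_coefficients(9, -6, 17, 0, 0, -288, theta))
```

With this choice of angle the first coefficient is the larger of the two values in (7-13). Adding $\pi/2$ to $\theta$ also removes the cross product term and simply interchanges $A'$ and $C'$.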


The values for $D'$ and $E'$ are similarly determined by using the values
determined by (7-12). There is here, however, a matter of choice.
Equations (7-12) determine $\cos^2\theta$ and $\sin^2\theta$. There are in
general four possible angles $\theta$ between $0$ and $2\pi$ which will
satisfy these requirements, but only two of these will also satisfy
$\cos\theta\sin\theta = B/2\Delta$. Generally, it probably is most
convenient to choose the angle in the first or fourth quadrant, which
corresponds to taking the positive root for $\cos\theta$, and the
appropriate root for $\sin\theta$ so that $\sin\theta\cos\theta = B/2\Delta$.

In practical problems, it probably is unwise to try to memorize these
formulas. If Eqs. (7-13) cannot be referred to, it is better to start with
Eqs. (7-8) and work the relations out in full. An example will show how
this may be done. Let us attempt to find the point set which satisfies
the equation

\[ 9x^2 - 6xy + 17y^2 - 288 = 0. \]


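We set $x = x'\cos\theta - y'\sin\theta$ and $y = x'\sin\theta + y'\cos\theta$, getting
\[
\begin{aligned}
&9[x'^2\cos^2\theta - 2x'y'\cos\theta\sin\theta + y'^2\sin^2\theta] \\
&\quad - 6[x'^2\cos\theta\sin\theta + x'y'(\cos^2\theta - \sin^2\theta) - y'^2\cos\theta\sin\theta] \\
&\quad + 17[x'^2\sin^2\theta + 2x'y'\cos\theta\sin\theta + y'^2\cos^2\theta] - 288 = 0,
\end{aligned}
\]
or
\[
\begin{aligned}
&[9\cos^2\theta - 6\cos\theta\sin\theta + 17\sin^2\theta]x'^2
 + [16\cos\theta\sin\theta - 6(\cos^2\theta - \sin^2\theta)]x'y' \\
&\quad + [9\sin^2\theta + 6\cos\theta\sin\theta + 17\cos^2\theta]y'^2 - 288 = 0.
\end{aligned}
\]
The expression which we wish to eliminate is
\[
\begin{aligned}
16\cos\theta\sin\theta - 6(\cos^2\theta - \sin^2\theta) &= 8\sin 2\theta - 6\cos 2\theta \\
&= 2[4\sin 2\theta - 3\cos 2\theta].
\end{aligned}
\]


This will be zero if we set
\[ \sin 2\theta = \tfrac{3}{5}, \qquad \cos 2\theta = \tfrac{4}{5}. \]


In this case,
\[
\begin{aligned}
\cos\theta\sin\theta &= \tfrac{1}{2}\sin 2\theta = \tfrac{3}{10}, \\
\cos^2\theta &= \tfrac{1}{2}(1 + \cos 2\theta) = \tfrac{9}{10}, \\
\sin^2\theta &= \tfrac{1}{2}(1 - \cos 2\theta) = \tfrac{1}{10}.
\end{aligned}
\]
Substituting in these values, we have the equation


\[ 8x'^2 + 18y'^2 - 288 = 0, \]
or
\[ \frac{x'^2}{36} + \frac{y'^2}{16} = 1. \]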
This we immediately recognize as an ellipse with the $x'$-axis as its principal
axis and with principal and conjugate dimensions 6 and 4 respectively.

Since $\sin\theta\cos\theta$ is positive, we can choose the positive roots
for $\sin\theta$ and $\cos\theta$, giving $\cos\theta = 3/\sqrt{10}$ and
$\sin\theta = 1/\sqrt{10}$. The $x'$-axis can thus be sketched in by drawing
the line through the origin and $(3, 1)$. It is then easy to make a sketch of
the ellipse as in Fig. 7-4. How have the coordinates given in Fig. 7-4 been
determined? What are the coordinates of the foci?


FIGURE 7-4


In order to see whether a given equation of the form (7-7) represents
an ellipse, hyperbola, or parabola, it is not necessary to do all of this
computation. We will show that the expression $B^2 - 4AC$ remains
unchanged under rotation of coordinates, hence this expression will be the
same as $-4A'C'$ when the coordinates are so chosen that $B' = 0$. But the
equation

\[ A'x'^2 + C'y'^2 + D'x' + E'y' + F' = 0 \]


will represent an ellipse if $A'C' > 0$, a parabola if $A'C' = 0$, and a
hyperbola if $A'C' < 0$ (without considering degenerate cases, or the
possibility of no locus). First, however, we must prove the invariance of
this expression.


Theorem 7-2. If the equation


\[ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 \]
becomes


\[ A'x'^2 + B'x'y' + C'y'^2 + D'x' + E'y' + F' = 0 \]
under a rotation of the coordinate system, then
\[
\begin{aligned}
C' + A' &= C + A, \\
B'^2 + (C' - A')^2 &= B^2 + (C - A)^2,
\end{aligned}
\]
and
\[ B'^2 - 4A'C' = B^2 - 4AC. \]

Proof: First we observe from the relations (7-9) that


\[
\begin{aligned}
C' + A' &= A(\cos^2\theta + \sin^2\theta) + C(\cos^2\theta + \sin^2\theta) \\
&= C + A,
\end{aligned}
\]


so that $C + A$ is invariant under the rotation.

Next we compute


\[
\begin{aligned}
C' - A' &= (C - A)(\cos^2\theta - \sin^2\theta) - 2B\cos\theta\sin\theta \\
&= (C - A)\cos 2\theta - B\sin 2\theta.
\end{aligned}
\]
Likewise,
\[ B' = (C - A)\sin 2\theta + B\cos 2\theta. \]
Therefore,
\[
\begin{aligned}
B'^2 + (C' - A')^2 &= (C - A)^2\sin^2 2\theta + 2B(C - A)\sin 2\theta\cos 2\theta + B^2\cos^2 2\theta \\
&\quad + (C - A)^2\cos^2 2\theta - 2B(C - A)\sin 2\theta\cos 2\theta + B^2\sin^2 2\theta \\
&= B^2 + (C - A)^2.
\end{aligned}
\]


Hence the expression $B^2 + (C - A)^2$, which is $\Delta^2$, is also invariant
under the rotation.
Finally, we merely need to observe that


\[
\begin{aligned}
B'^2 - 4A'C' &= B'^2 + (C' - A')^2 - (C' + A')^2 \\
&= B^2 + (C - A)^2 - (C + A)^2 \\
&= B^2 - 4AC.
\end{aligned}
\]

We immediately obtain the following theorem as a corollary. 


Theorem 7-3. The set of points satisfying the equation
\[ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 \]


is an ellipse, a point, or an empty set if $B^2 - 4AC < 0$. It is a parabola,
two parallel lines, or a line if $B^2 - 4AC = 0$. It is a hyperbola or a
pair of intersecting lines if $B^2 - 4AC > 0$.


For example, the equation that we considered above,
\[ 9x^2 - 6xy + 17y^2 - 288 = 0, \]
was found to be the equation of an ellipse. For this equation,
\[ B^2 - 4AC = 36 - 4 \cdot 9 \cdot 17 < 0, \]

and hence this information could have been obtained from an application 
of Theorem 7-3. 
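
Theorem 7-3 amounts to a three-way test on the sign of $B^2 - 4AC$, which might be coded as follows. The function name and the returned strings are our own, and the test for equality with zero assumes exact (for instance, integer) coefficients.

```python
def conic_type(A, B, C):
    """Possible loci of Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0, by Theorem 7-3."""
    disc = B * B - 4 * A * C
    if disc < 0:
        return "ellipse, point, or empty set"
    if disc == 0:
        return "parabola, two parallel lines, or a line"
    return "hyperbola or pair of intersecting lines"


print(conic_type(9, -6, 17))  # ellipse, point, or empty set
print(conic_type(0, 1, 0))    # hyperbola or pair of intersecting lines
```

The second call corresponds to the equation $xy = 1$ of (7-5). Deciding among the alternatives within each group still requires the remaining coefficients.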

It is relatively easy to remember that $B^2 - 4AC$ remains unchanged
when the coordinate system is rotated. If the student can also remember
that $C + A$ is invariant under the rotation, then he knows all he needs
in order to obtain the coefficients of the transformed equation. We wish
to rotate the coordinate system so that $B' = 0$. If this is done, we will
then have the relations

\[
\begin{aligned}
-4A'C' &= B^2 - 4AC, \\
C' + A' &= C + A.
\end{aligned}
\]


Knowing $A$, $B$, and $C$, we can solve these equations for $A'$ and $C'$.
Again using the above example,


\[ 9x^2 - 6xy + 17y^2 - 288 = 0, \]
we have
\[
\begin{aligned}
-4A'C' &= -576, \\
C' + A' &= 26.
\end{aligned}
\]


From the first equation we have $A' = 144/C'$. Substituting this into the
second equation gives $C' + 144/C' = 26$, or


\[ C'^2 - 26C' + 144 = 0. \]


This gives $C' = 18$ or $8$, and hence $A' = 8$ or $18$, respectively. If we
use the first pair, we find the transformed equation


\[ 8x'^2 + 18y'^2 - 288 = 0. \]
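
The same computation can be sketched in Python. Since the sum and the product of $A'$ and $C'$ are known from the two invariants, they are the two roots of a quadratic equation; the function name below is our own, and the smaller root is returned first.

```python
import math


def coefficients_from_invariants(A, B, C):
    """Return (A', C') with B' = 0, using A' + C' = A + C and -4A'C' = B^2 - 4AC."""
    total = A + C
    product = (4 * A * C - B * B) / 4
    # A' and C' are the two roots of t^2 - total*t + product = 0.
    # The quantity under the root equals B^2 + (C - A)^2, so it is never negative.
    root = math.sqrt(total * total - 4 * product)
    return ((total - root) / 2, (total + root) / 2)


print(coefficients_from_invariants(9, -6, 17))  # (8.0, 18.0)
```

As noted in the text, this gives the coefficients but not the angle of rotation, and it does not say which of the two roots belongs to $x'^2$.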


This process gives us the transformed equation, but it does not tell us 
the angle through which the coordinate system has been rotated. The 
latter can be obtained in another way, however. The last equation tells us 
that we have an ellipse whose principal dimension is 6 and whose conjugate 
dimension is 4. The center is at the origin, which is unchanged by the 
rotation. Hence, if we could find the coordinates of a principal vertex 
in the $(x, y)$-coordinate system, then we could locate the ellipse as in Fig. 7-4.

The coordinates of a principal vertex can, however, be found easily
by finding a point on the locus whose distance from the origin is 6 (the
same process will work equally well in the case of a hyperbola). That is,
we wish to solve the pair of equations,


\[
\begin{aligned}
9x^2 - 6xy + 17y^2 - 288 &= 0, \\
x^2 + y^2 &= 36,
\end{aligned}
\]


simultaneously. Multiplying the second equation by 9 and subtracting
gives
\[ -6xy + 8y^2 = -36, \]
and hence,
\[ x = \frac{4y^2 + 18}{3y} = \frac{4y}{3} + \frac{6}{y}. \]

Combining this with $x^2 + y^2 = 36$ gives


\[ \frac{16}{9}y^2 + 16 + \frac{36}{y^2} + y^2 = 36, \]
or
\[ 25y^4 - 180y^2 + 324 = 0. \]


A positive root of this equation is $y^2 = 18/5$. From this we have
$x^2 = 36 - y^2 = 162/5$. If we insert these values into the original
equation of the locus, we find


\[ 6xy = \frac{648}{10}. \]


This is satisfied when we take the roots for $x$ and $y$ to have the same
sign. Thus, we can choose the positive roots, and find the point
$(18/\sqrt{10}, 6/\sqrt{10})$ to be one of the principal vertices.
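
A quick numerical check of this vertex is easy to write. The Python sketch below is only a verification aid with names of our own choosing: it confirms that the point lies on the locus and at distance 6 from the origin.

```python
import math


def conic_value(x, y):
    """Left-hand side of 9x^2 - 6xy + 17y^2 - 288 = 0."""
    return 9 * x * x - 6 * x * y + 17 * y * y - 288


vertex = (18 / math.sqrt(10), 6 / math.sqrt(10))
on_conic = abs(conic_value(*vertex)) < 1e-9   # the point satisfies the equation
distance = math.hypot(*vertex)                # its distance from the origin

print(on_conic, distance)
```

The ratio of the two coordinates is $1/3$, in agreement with the line through the origin and $(3, 1)$ found earlier for the $x'$-axis.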


PROBLEMS 


1. Prove that the relations of (7-13) follow from (7-9) and (7-12). 


2. Show that there are in general four possible angles $\theta$ between $0$ and $2\pi$ such
that
\[ (C - A)\sin 2\theta + B\cos 2\theta = 0. \]


How are these four angles related to each other? 


3. For each of the following equations, find and sketch the locus, first by using 
relations (7-13) directly, and then by working out the relations in full, 
starting from Eqs. (7-8). 


(a) $9x^2 + 24xy + 16y^2 + 100x - 40y + 100 = 0$

(b) $5x^2 + 8xy + 5y^2 - 9 = 0$

(c) $x^2 + 2\sqrt{3}\,xy - y^2 + 1 = 0$

(d) $3x^2 - 6xy + y^2 - 4 = 0$


4. If the expression $F(x, y) = Ax^2 + Bxy + Cy^2$ is evaluated at points of the
unit circle, that is, at points where $x = \cos\phi$ and $y = \sin\phi$, we have


\[ V(\phi) = F(\cos\phi, \sin\phi) = A\cos^2\phi + B\cos\phi\sin\phi + C\sin^2\phi. \]
Show that
\[ V(\phi) = \tfrac{1}{2}[(C + A) + B\sin 2\phi - (C - A)\cos 2\phi]. \]
Show also that
\[ V(\phi) = \tfrac{1}{2}[(C + A) + \Delta\cos(2\theta - 2\phi)], \]
where $\Delta$ is defined as in (7-11) and $\theta$ is the angle determined as in (7-10).
For what value of $\phi$ is this expression a maximum? a minimum?


5. Equations (7-13) give A’ and C’ for the coordinate system in which B’ = 0. 
What is A’C’ in terms of A, B, and C according to these equations? Does 
this offer another proof of Theorem 7-3? 


7-3 THE QUADRIC SURFACES 
The general quadratic equation in the three variables $x$, $y$, and $z$ is


\[ Ax^2 + By^2 + Cz^2 + Hxy + Jyz + Kxz + Dx + Ey + Fz + G = 0. \tag{7-14} \]


In this section we wish to discuss the point sets in the three-dimensional 
space which can satisfy such an equation. These are called the quadric 
surfaces. 

Just as in the case of quadratic equations in two variables, there always 
exists a rotation of the coordinate system such that the cross product terms 
in (7-14) can be eliminated. The proof of this would take too much time 
to give here and is more difficult than might appear at first glance. For, 
suppose we consider the $z$-axis as fixed and rotate the $x$- and $y$-axes. As
was shown in the last section, this can be done so as to eliminate the $xy$
term. It might seem that we could continue by holding the new $y$-axis
fixed and rotating the (new) $x$-axis and $z$-axis to eliminate the $xz$ term.
However, if we try to do this we find that the $yz$ term generates a new
$xy$ term (why?).

Actually, several proofs are available. One in particular results from an 
extension of the observations made in Problem 4 of the last section, but it 
too requires techniques we do not want to develop at this point. Instead, 
we will just assume that this has been accomplished and study only equa- 
tions of the form 


\[ Ax^2 + By^2 + Cz^2 + Dx + Ey + Fz + G = 0. \tag{7-15} \]


The character of the locus of this equation changes drastically as the
various coefficients in it take on positive, negative, or zero values. Since
there are seven coefficients with three possibilities for each coefficient,
there are a total of $3^7 = 2187$ possible cases to consider. We can, however,
cut the number of these cases down to a reasonable size by making a few 
observations. 

First, if any one of the three coefficients $A$, $B$, or $C$ is not zero, then
we can make the corresponding coefficient $D$, $E$, or $F$ respectively zero.
This can be done by means of a translation (completing the square in that
variable).
For example, we could write


\[
\begin{aligned}
x^2 + 2y^2 - z^2 + 4x - 4y + 6z - 10
&= (x + 2)^2 + 2(y - 1)^2 - (z - 3)^2 - 7 \\
&= x'^2 + 2y'^2 - z'^2 - 7.
\end{aligned}
\]
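
This first reduction is mechanical enough to sketch in Python. The function below is an illustration with names of our own: for each variable whose squared term is present it completes the square, records the required shift, and adjusts the constant term. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def complete_squares(A, B, C, D, E, F, G):
    """Remove the linear terms of (7-15) wherever the squared term is present.

    Returns the shifts (h, k, l), so that x' = x - h and so on, the remaining
    linear coefficients (D', E', F'), and the new constant term G'.
    """
    shifts, linear = [], []
    constant = Fraction(G)
    for quad, lin in ((A, D), (B, E), (C, F)):
        quad, lin = Fraction(quad), Fraction(lin)
        if quad != 0:
            # quad*t^2 + lin*t = quad*(t + lin/(2*quad))^2 - lin^2/(4*quad)
            shifts.append(-lin / (2 * quad))
            linear.append(Fraction(0))
            constant -= lin * lin / (4 * quad)
        else:
            shifts.append(Fraction(0))
            linear.append(lin)
    return tuple(shifts), tuple(linear), constant


shifts, linear, constant = complete_squares(1, 2, -1, 4, -4, 6, -10)
print(shifts)  # the shifts are -2, 1, 3, that is, x' = x + 2, y' = y - 1, z' = z - 3
```

Only the first of the reductions is carried out here; a linear term that survives, because its squared term is absent, is returned unchanged.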


Secondly, if any of the coefficients D, E, or F is nonzero while the 
corresponding quadratic coefficient is zero, then we can make G = 0, 
again by means of a translation. An example of this would be 


\[
\begin{aligned}
x^2 + 2y^2 - 6z + 8 &= x^2 + 2y^2 - 6(z - \tfrac{4}{3}) \\
&= x^2 + 2y^2 - 6z'.
\end{aligned}
\]


Thirdly, if one of the three coefficients A, B, or C is nonzero, we can 
assume it to be A, interchanging the roles of the coordinates if necessary 
(this is a particular case of rotation of coordinates, but one which is 
quite simple). Furthermore, we can assume A to be positive in this case, 
since we can multiply (7-15) through by —1 if necessary. 

Fourthly, if two of the coefficients A, B, or C are nonzero, then again by 
interchanging coordinates if necessary, we can assume them to be A and B. 

Finally, if A, B, and C are all nonzero, then at least two are of the same 
sign. We may assume these two to be A and B, and furthermore that both 
are positive (why?).

On page 233 we give a table which lists all of the pertinent cases re- 
maining after the above reductions have been made. In this table, a zero 
means that the coefficient is zero; a + or — indicates that the coefficient 
is nonzero and positive or negative respectively; an x indicates that the 
coefficient is nonzero (the sign being immaterial in that case); and a zero 
in parentheses indicates that the coefficient can be assumed to be zero 
because of one of the above comments. A blank space means that the 
coefficient in that case does not matter. There are 18 cases listed in this 
table. The only cases which have been left out are those in which A, B, C,
D, E, and F are all zero (why has this been left out?). Of these eighteen 
cases, nine are fairly trivial in one way or another, and the remaining 
nine are of considerable interest. These are indicated by an asterisk in the 
table. 

We turn now to an analysis of these nine interesting cases. Each time,
we attempt to build up a picture of the locus satisfying the equation by
considering the curves which result when a plane parallel to one of the
coordinate planes is allowed to cut the surface.


Case       A   B   C   D    E    F    G    Locus
I (1)      0   0   0   (not all zero) (0)  Plane
II (1)     +   0   0   (0)  0    0    +    Empty set
II (2)     +   0   0   (0)  0    0    0    Plane
II (3)     +   0   0   (0)  0    0    -    Two parallel planes
II (4)*    +   0   0   (0)  x         (0)  Parabolic cylinder
III (1)    +   +   0   (0)  (0)  0    +    Empty set
III (2)    +   +   0   (0)  (0)  0    0    Line
III (3)*   +   +   0   (0)  (0)  0    -    Elliptic cylinder
III (4)*   +   +   0   (0)  (0)  x    (0)  Elliptic paraboloid
IV (1)     +   -   0   (0)  (0)  0    0    Two intersecting planes
IV (2)*    +   -   0   (0)  (0)  0    x    Hyperbolic cylinder
IV (3)*    +   -   0   (0)  (0)  x    (0)  Hyperbolic paraboloid
V (1)      +   +   +   (0)  (0)  (0)  +    Empty set
V (2)      +   +   +   (0)  (0)  (0)  0    Single point
V (3)*     +   +   +   (0)  (0)  (0)  -    Ellipsoid
VI (1)*    +   +   -   (0)  (0)  (0)  +    Hyperboloid of two sheets
VI (2)*    +   +   -   (0)  (0)  (0)  0    Elliptic cone
VI (3)*    +   +   -   (0)  (0)  (0)  -    Hyperboloid of one sheet


Case II (4). The Parabolic Cylinder


In this case the equation can be written as


\[ x^2 + Ey + Fz = 0 \]


after dividing through by $A$. We assume $E$ to be nonzero. Any plane
orthogonal to the $x$-axis cuts the surface in a straight line, the line
with equation


\[ Ey + Fz + a^2 = 0 \]


where $x = a$ is the cutting plane. All such lines are parallel and the
resulting surface is called a cylinder (Fig. 7-5), since it can be generated
by a set of parallel lines. Any plane orthogonal to the $z$-axis cuts the
cylinder in a parabola and all such parabolas are translates of one another
in the $y$-, $z$-directions. The vertices lie on a straight line in the
$yz$-plane, and the axes of the parabolas are all parallel to the $y$-axis.


FIGURE 7-5


Note that we can make a rotation in the $yz$-plane so that $Ey + Fz = E'y'$.
Then the surface becomes particularly simple.


Case III (3). The Elliptic Cylinder 


By dividing through by $-G$, we can assume the equation to be in the
form

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \]


where we have written the coefficients of $x^2$ and $y^2$ in this form to
indicate that they are positive. This is the equation of an ellipse in the
$xy$-plane; and since $z$ does not appear in the equation, if any point is on
the locus, then the entire line through that point parallel to the $z$-axis
will also be on the locus. The surface is therefore again a cylinder, but
this time an elliptic cylinder (Fig. 7-6).


FIGURE 7-6    FIGURE 7-7


Case III (4). The Elliptic Paraboloid 


The equation can be divided through by $|F|$ and written in the form


\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = \pm z. \]

The two cases here are similar, and we will discuss only the one with the
positive sign on $z$. The negative sign would merely invert the locus.
Assuming the positive sign on $z$, there are no points with negative $z$ which
satisfy the equation. Setting $z = c^2$, we see that the intersection of the
surface with a plane orthogonal to the $z$-axis is an ellipse. All such
ellipses are similar (have the same eccentricity). A plane orthogonal to the
$x$- (or $y$-) axis cuts the surface in a parabola. In particular, the planes
$x = 0$ and $y = 0$ cut the surface in parabolas which pass through the
vertices of the above ellipses. The surface is illustrated in Fig. 7-7.


Case IV (2). The Hyperbolic Cylinder


Dividing the equation through by $|G|$ we obtain an equation of the form


\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = \pm 1. \]

This is the equation of a hyperbola in the $xy$-plane. The principal axis is
the $x$-axis or the $y$-axis depending on the sign of the right-hand side.
Again, the surface is a cylinder with generators parallel to the $z$-axis
(Fig. 7-8).





FIGURE 7-8 





FIGURE 7-9 
Case IV (3). The Hyperbolic Paraboloid 


By dividing through by $|F|$ we can obtain the equation


\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = \pm z. \]

We will consider only the case with $-z$ on the right-hand side. The other
case is similar. The cross sections of this surface in planes orthogonal to
the $x$-axis are parabolas, opening upward. The cross sections in planes
orthogonal to the $y$-axis are parabolas which open downward. Finally,
the cross sections in planes orthogonal to the $z$-axis are hyperbolas (except
when $z = 0$). When $z > 0$, these have their principal axis in the $yz$-plane.
When $z < 0$, the principal axis is in the $xz$-plane. For $z > 0$, the vertices
of these hyperbolas lie on the parabola $y^2 = b^2 z$. For $z < 0$, the
vertices are on the parabola $x^2 = -a^2 z$. When $z = 0$ we find


\[ \left(\frac{x}{a} - \frac{y}{b}\right)\left(\frac{x}{a} + \frac{y}{b}\right) = 0. \]


The locus of this equation is a pair of intersecting lines. A surprising
property of this surface is the fact that through every point of the surface
there exist two distinct straight lines which lie on the surface. This is
shown in Problem 8 at the end of this section.

This surface, as illustrated in Fig. 7-9, is probably the most interesting
of the quadric surfaces.


Case V (3). The Ellipsoid 


This equation can be divided through by $-G$ to give an equation of the
form

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1. \]


Every cross section of this surface in a plane orthogonal to one of the
axes is an ellipse (or a point, or nothing). The ellipses orthogonal to a
given axis are all similar. The surface is sketched in Fig. 7-10.





Figure 7-10 





Figure 7-11


Case VI (1). The Hyperboloid of Two Sheets


The equation in this case can be brought to the form


\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1. \]

Every cross section in a plane orthogonal to the $z$-axis is an ellipse (or a
point, or nothing) and all of these ellipses are similar. The cross sections
in planes orthogonal to the $x$-axis or $y$-axis are hyperbolas, with principal
axis parallel to the $z$-axis in the $xz$- or $yz$-plane. There are no points
on the locus for $-c < z < c$ and the surface is in two parts (as shown in
Fig. 7-11).


Case VI (2). The Elliptic Cone 


When the coefficient $G$ is zero, the equation can be written as


\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0. \]


The cross sections in planes orthogonal to the $z$-axis are ellipses (except
when $z = 0$). The cross sections in planes orthogonal to the $x$- and $y$-axes
are, in general, hyperbolas. However, when $x = 0$ or $y = 0$ these cross
sections are a pair of intersecting lines. In fact, it is easy to verify that if
any point is on the locus, then the entire straight line through this point
and the origin is on the locus. This is the characteristic property of a
cone. This surface is shown in Fig. 7-12.





Figure 7-12 Figure 7-13 


Case VI (3). The Hyperboloid of One Sheet 
The equation in this case can be brought to the form

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1.$$


The analysis is similar to that of the hyperboloid of two sheets. The dif- 
ference here is that every plane orthogonal to the z-axis intersects the 
locus in an ellipse. The locus therefore does not fall into two parts. The 
planes x = 0 and y = 0 intersect the locus in hyperbolas whose principal 
axes are orthogonal to the z-axis. The resulting surface is as shown in Fig. 
7-13. Notice that hyperbolas in the xz- and yz-planes pass through the 
vertices of the ellipses found as intersections of the surface with planes 
orthogonal to the z-axis. 

This surface also has the property of being made up of straight lines. 
This is proved in Problem 7 at the end of this section. 
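
The straight-line property can be checked numerically before it is proved. The following is a minimal sketch, assuming the standard form x²/a² + y²/b² − z²/c² = 1 and the two direction vectors [a²y₀, −b²x₀, abc] and [−a²y₀, b²x₀, abc] through a point (x₀, y₀, 0) of the surface. The function names and the sample values of a, b, c are illustrative choices only, and a numerical check is of course not a proof.

```python
import math

def on_hyperboloid(x, y, z, a, b, c, tol=1e-9):
    """True if (x, y, z) satisfies x^2/a^2 + y^2/b^2 - z^2/c^2 = 1 within tol."""
    value = x * x / a**2 + y * y / b**2 - z * z / c**2
    return abs(value - 1.0) <= tol

def ruling_point(phi, t, a, b, c, sign=1):
    """Point X0 + t*B on one of the two lines through X0.

    X0 = (a cos phi, b sin phi, 0) lies on the surface and on the plane z = 0.
    sign = +1 uses B = [a^2 y0, -b^2 x0, abc]; sign = -1 uses [-a^2 y0, b^2 x0, abc].
    """
    x0, y0 = a * math.cos(phi), b * math.sin(phi)
    bx, by, bz = sign * a**2 * y0, -sign * b**2 * x0, a * b * c
    return (x0 + t * bx, y0 + t * by, t * bz)

# Hypothetical sample surface: a = 2, b = 3, c = 5.
a, b, c = 2.0, 3.0, 5.0
all_on_surface = all(
    on_hyperboloid(*ruling_point(phi, t, a, b, c, sign), a, b, c)
    for phi in (0.0, 0.7, 2.1, 4.0)
    for t in (-3.0, -0.5, 0.0, 1.0, 2.5)
    for sign in (1, -1)
)
print(all_on_surface)
```

Every sampled point of both lines satisfies the equation of the surface, which is what the problem asks the reader to establish in general.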


There is a special case which can occur in each of six of the above cases. 
Whenever two of the coefficients A, B, and C are equal (and of the same 
sign) we will have a surface of revolution. When A = B we would have 
surfaces of revolution about the z-axis. This means that if (x_0, y_0, z_0) is
on the surface, then all points of the circle in the plane z = z_0 with center
at (0, 0, z_0) are also on the locus. This follows from the fact that a point
(x_1, y_1, z_0) is on this circle if and only if

$$x_1^2 + y_1^2 = x_0^2 + y_0^2.$$

Thus if the quadratic equation is of the form

$$Ax^2 + Ay^2 + \cdots = 0,$$

this condition is satisfied.

Surfaces of revolution can be visualized easily. If, for example, it is a 
surface of revolution about the z-axis, we can find the curve of intersection 
of the surface with the xz-plane (setting y = 0) and imagine this curve
being rotated about the z-axis to generate the surface. 

Surfaces of revolution are of such importance that most of them are 
given special names. 

In case III (3), if A = B the elliptic cylinder becomes a right circular 
cylinder. Similarly, in case VI (2) the elliptic cone becomes a right circular 
cone. 

If A = B in the elliptic paraboloid of case III (4), we obtain a surface 
of revolution which is called just a paraboloid. In cases VI (1) and (3) we 
have hyperboloids of revolution, of two and one sheets respectively. 

Finally, if two of the coefficients are equal in the equation of the ellipsoid, 
we obtain a surface known as a spheroid. Assuming that A = B, a 
spheroid has an equation of the type 


$$\frac{x^2 + y^2}{a^2} + \frac{z^2}{c^2} = 1.$$


If c > a, then the spheroid looks something like a football and is called a
prolate spheroid. If c < a, then the surface is called an oblate spheroid. The
oblate spheroid looks like a curling stone (or like a volleyball that is
being sat upon). What happens when c = a?

The reader is expected to learn the names (together with the forms) 
of the various quadric surfaces, but he should not attempt to memorize 
which forms of the equation go with which of the surfaces. Rather, when 
faced with the problem of identifying a quadric surface, say, 


$$4x^2 - y^2 + z^2 - 8y + 4z - 21 = 0, \tag{7-16}$$

he should proceed by working up a picture of the surface by considering
the cross sections in the planes parallel to the coordinate planes. For
example, Eq. (7-16) would be rewritten

$$4x^2 - (y + 4)^2 + (z + 2)^2 = 9,$$

or

$$\frac{x^2}{9/4} - \frac{(y + 4)^2}{9} + \frac{(z + 2)^2}{9} = 1. \tag{7-17}$$
Figure 7-14 (a), (b), (c)


In choosing the cross sections to consider first, we prefer to find planes 
in which the cross sections are ellipses (or circles). If such planes exist, 
it is usually easiest to build up the picture by starting with them. In 
(7-17) we see that any plane y = c will intersect the surface in an ellipse. 
The resulting ellipse has its center at x = 0, z = −2. Its principal axis
will be the line x = 0 and its conjugate axis the line z = —2 (all of these 
being in the plane y = c). (See Fig. 7-14a.) 

All of these ellipses are similar, and they can easily be determined by 
the location of their vertices. The principal vertices are the points of 
intersection of the surface with the plane x = 0, that is, the points on 


$$\frac{(z + 2)^2}{9} - \frac{(y + 4)^2}{9} = 1.$$


This is a hyperbola in the yz-plane with center at y = —4, z = —2 (see 
Fig. 7-14b). 

The conjugate vertices of the ellipses are the points of intersection of 
the surface with the plane z = —2. These are the points in this plane 


which satisfy the equation

$$\frac{x^2}{9/4} - \frac{(y + 4)^2}{9} = 1,$$

and hence lie on a hyperbola with center x = 0 and y = −4 (Fig. 7-14c).

Figure 7-15 


Collecting the information, we see that the surface is a hyperboloid of
one sheet whose axis is the line x = 0, z = −2. The surface is as shown in
Fig. 7-15.
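
The completing-the-square step used in this example can be mechanized. The sketch below is one possible way to do it, assuming a quadric with no cross-product terms and with all three squared coefficients nonzero. The function names `complete_squares` and `classify` are illustrative and not part of the text, and the coefficients used are those obtained by expanding 4x² − (y + 4)² + (z + 2)² = 9.

```python
from fractions import Fraction

def complete_squares(A, B, C, D, E, F, G):
    """Rewrite A x^2 + B y^2 + C z^2 + D x + E y + F z + G = 0 (A, B, C nonzero)
    as A (x - x0)^2 + B (y - y0)^2 + C (z - z0)^2 = rhs.
    Returns ((x0, y0, z0), rhs) using exact rational arithmetic."""
    A, B, C, D, E, F, G = (Fraction(v) for v in (A, B, C, D, E, F, G))
    center = (-D / (2 * A), -E / (2 * B), -F / (2 * C))
    rhs = D * D / (4 * A) + E * E / (4 * B) + F * F / (4 * C) - G
    return center, rhs

def classify(A, B, C, rhs):
    """Name the locus of A u^2 + B v^2 + C w^2 = rhs with A, B, C all nonzero."""
    if rhs < 0:  # normalize so that the right-hand side is nonnegative
        A, B, C, rhs = -A, -B, -C, -rhs
    positives = sum(1 for v in (A, B, C) if v > 0)
    if rhs == 0:
        return "point" if positives in (0, 3) else "elliptic cone"
    return {3: "ellipsoid",
            2: "hyperboloid of one sheet",
            1: "hyperboloid of two sheets",
            0: "no locus"}[positives]

# Coefficients of 4x^2 - y^2 + z^2 - 8y + 4z - 21 = 0.
center, rhs = complete_squares(4, -1, 1, 0, -8, 4, -21)
print(center, rhs, classify(4, -1, 1, rhs))
```

The sketch only names the surface. The cross-section analysis described above is still what produces the picture.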


PROBLEMS 


1. Show that nine cases of the above table which were not discussed in the text 
have loci as listed in the table. 


2. What relationship exists between the two intersecting planes of case IV (1) 
and the hyperbolic cylinder of IV (2) when A, B, and C are the same in the 
two equations? What relations hold between the planes of IV (1) and hyper- 
bolic paraboloid IV (3)? 


3. What relationship exists between the elliptic cone of case VI (2) and the
surfaces of cases VI (1) and (3) when A, B, and C are the same in the two
equations?


4. Identify the surfaces defined by each of the following equations. Discuss the
intersections of planes parallel to the coordinate planes. Sketch the surface. 


(a) 22+ 422 = 0 (b) y2-+ 922 — 362 = 0 

(c) 4x2? + 9y2 + 3622 — 36 = 0 (d) 422 — y?+ 422+ 12 = 0 
(e) x? — 1622+ 48y = 0 (f) y? — 42 + 16 = 0 

(g) 6y? — 4r +z = 0 (h) 622 + 4y? + 422 — 12 = 0 

5. Follow the same directions as in problem 4. 

(a) 4r? — y2 + 1222 — 36 = 0 (b) y2-+ 422 — 16 = 0 

(c) y2? — 92 = 0 (d) 9x? + 922 — 4y = 0 

(e) z? — 8y? — 822 = 0 (f) 8r? + y2? — 8z? — 32 = 0 
(g) 9x? + 4y? + 922 — 36 = 0 (h) 22? — xr + 2y = 0 


6. Identify the surface defined by each of the following equations. Discuss the 
intersections of planes parallel to the coordinate planes. Sketch the surface. 
(a) r? — y? + 422 + 6r — 82+ 14 = 0 
(b) 4x2 + 922 — 12y+ 6 = 0 
(c) 6r? + 2y? + z2? — 24r + 8y — 4z = 0 
(d) 9y2 — 4z2 + 10r = 0 
(e) 4x2 — y24+ 422+ 16 = 0 
(f) z? + 4y?2 — 322 — 27 — 122 — 11 = 0 
(g) x2 — 422+ 62 = 0 
7. Let X_0 = (x_0, y_0, 0) be any point on the intersection of the hyperboloid of
one sheet

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$$

with the plane z = 0. Show that every point of the lines

$$X = X_0 + tB$$

is on the hyperboloid for

$$B = [a^2 y_0,\ -b^2 x_0,\ abc],$$
or
$$B = [-a^2 y_0,\ b^2 x_0,\ abc].$$


8. Let X_0 = (x_0, 0, z_0) be any point on the intersection of the hyperbolic


paraboloid 
2 2 


x y _ 

ape 
with the plane y = 0. Show that every point of the lines X = X_0 + tB is
on the hyperbolic paraboloid for

$$B = [a^2,\ ab,\ -2x_0],$$
or
$$B = [a^2,\ -ab,\ -2x_0].$$


9. A quadric surface is called central if and only if there is a point P, called the
center, such that whenever the point X is on the surface then so is the point X′,
where

X′ = P − PX
(i.e., PX′ = −PX).
Which of the quadric surfaces defined in this section are central? What are
their centers?
We have made an identification between points of the cartesian plane
and vectors. Thus, if X is the point (x, y), we identify it with the vector

$$X = \overrightarrow{OX} = x e_1 + y e_2$$


(O being the origin). The vector can then be thought of as determining the 
point. However, we can specify a vector by giving its length and direction 
instead of its cartesian coordinates. 

If X is an arbitrary nonzero vector in the cartesian plane, then |X| is
its magnitude and e_r = X/|X| is a unit vector in the same direction. Here,
we use the subscript r to indicate that the unit vector is in the radial
direction. A subscript θ would be more to the point, since this vector
depends on θ, but standard usage calls for e_r with the dependence on θ
being understood.

Since e_r · e_1 = cos θ and e_r · e_2 = sin θ (why?), we have e_r =
cos θ e_1 + sin θ e_2. We thus have the motivation for the definition of
polar coordinates.


Definition 7-1. The polar coordinates of a point X in the cartesian plane
are a pair of real numbers (r, θ) such that

$$X = r e_r, \tag{7-18}$$
where
$$e_r = \cos\theta\, e_1 + \sin\theta\, e_2. \tag{7-19}$$


Several comments must be made about this definition. First of all, it is
clear that any given pair of real numbers (r, θ) determines a unique point
from these two conditions. Normally we think of the number r as being
|X|, the distance of the point in question from the origin, but this is not
necessary in this definition. The number r could just as well be −|X|.
This, however, is the only other possibility (why?).

On the other hand, a given point has many different sets of polar coordinates,
an infinite number in fact. If a point is fixed and we set r = |X|,
then there is exactly one θ_0 with 0 ≤ θ_0 < 2π satisfying the requirements
of the definition. Any angle θ = θ_0 + 2πk, k = 0, ±1, ±2, ..., will also
satisfy the definition. Choosing the other possibility r = −|X| requires
the angle θ_1 = θ_0 + π. Here again we can add any multiple of 2π and
still satisfy the conditions.
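
The non-uniqueness just described is easy to observe numerically. The following sketch assumes the usual relations x = r cos θ, y = r sin θ. The helper names and the sample point are illustrative choices, not notation from the text.

```python
import math

def polar_to_cartesian(r, theta):
    """Cartesian coordinates of the point r * e_r, with e_r = (cos theta, sin theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def same_point(p, q, tol=1e-9):
    """Compare two points coordinate by coordinate, allowing for rounding error."""
    return (math.isclose(p[0], q[0], abs_tol=tol)
            and math.isclose(p[1], q[1], abs_tol=tol))

# A sample point with r = |X| = 2 and theta_0 = pi/6.
r, theta0 = 2.0, math.pi / 6
base = polar_to_cartesian(r, theta0)

# Other coordinate pairs for the same point:
# (r, theta_0 + 2 pi k) and (-r, theta_0 + pi + 2 pi k).
equivalents = (
    [(r, theta0 + 2 * math.pi * k) for k in (-2, -1, 1, 2)]
    + [(-r, theta0 + math.pi + 2 * math.pi * k) for k in (-1, 0, 1)]
)
all_agree = all(same_point(base, polar_to_cartesian(rr, tt))
                for rr, tt in equivalents)
print(all_agree)
```

All seven coordinate pairs land on the same cartesian point, up to rounding.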


Polar coordinates are useful in many applications. In particular, they 
may be used to specify a curve in the plane, just as with cartesian co- 
ordinates. In general, this is done by giving an equation of the form 


$$r = f(\theta) \tag{7-20}$$

to specify the locus. We sometimes find θ as a function of r, or find an
equation in both r and θ, but equations of the form (7-20) are the type
which occur most often.

There are two types of questions which arise. First, given an equation 
of the form (7-20), what is its locus? Second, given a curve in the plane, 
what is the polar coordinate form of its equation? Both of these questions 
will be considered with the help of a few examples. We note first that the 
conversion from polar coordinates to cartesian coordinates or vice versa is 
aided by the relations 

$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad r^2 = x^2 + y^2, \tag{7-21}$$

which are easily proved from the definition. Some caution is required in
the use of these relations, however. When they are used to transform from 
one kind of coordinate to the other, we must always check that no ex- 
traneous points have been introduced into the locus, and that no points 
have been lost. 
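
As an illustration of the caution just mentioned, here is a minimal sketch of the conversion from cartesian to polar coordinates. The relation r² = x² + y² does not fix the sign of r, so the sketch makes the explicit (and arbitrary) choice r ≥ 0 and 0 ≤ θ < 2π. The function name is illustrative, and the angle assigned to the origin (θ = 0) is a convention of the code rather than of the text.

```python
import math

def cartesian_to_polar(x, y):
    """One choice of polar coordinates for (x, y): r >= 0 and 0 <= theta < 2*pi.
    The origin is assigned theta = 0 by convention here."""
    r = math.hypot(x, y)                      # nonnegative root of r^2 = x^2 + y^2
    theta = math.atan2(y, x) % (2 * math.pi)  # atan2 selects the correct quadrant
    return r, theta

# Round trip for a point in the second quadrant.
r, theta = cartesian_to_polar(-3.0, 4.0)
x_back, y_back = r * math.cos(theta), r * math.sin(theta)
print(r, theta, x_back, y_back)
```

Using the two-argument arctangent avoids the loss of quadrant information that occurs when θ is recovered from tan θ = y/x alone.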

Let us now look at a few loci defined by their equations in polar coordinates.
The general method of analysis is to determine a number of points
on the locus, and to discuss the behavior of r as θ varies to see how to
connect these points. A sketch may then be made.


Example I. THE CIRCLE

$$r = a.$$


Here, r is a constant function of θ. The locus is the set of all points at
the distance |a| from the origin. (Why the absolute value?) It is therefore 
a circle with radius |a|, centered at the origin. 


Example II. THE CIRCLE

$$r = 2a\cos\theta. \tag{7-22}$$


When θ = 0, the point (2a, 0) is determined. As θ increases from 0 to
π/2, r decreases (if a > 0) from 2a to zero. When θ goes from π/2 to
π, r is negative and we get points in the fourth quadrant. When θ = π,
r = −2a and we discover that we again have the point (2a, 0). As θ
increases from π to 2π, the points previously obtained are obtained again,
since cos (θ + π) = −cos θ. (See Fig. 7-16.)

Any point satisfying this equation must also satisfy

$$r^2 = 2ar\cos\theta,$$
and hence from (7-21) must also satisfy

$$x^2 + y^2 = 2ax.$$

FIGURE 7-16

Completing the square gives (x − a)² + y² = a². This, however, is the
equation of a circle with center at (a, 0) and radius
|a|. This discussion shows that the locus of r(r − 2a cos θ) = 0 is exactly
this circle. This equation was obtained from (7-22) by multiplying by
r. No points of the locus of (7-22) will have been lost by this process.
However, an extraneous point may have been introduced into the new
locus. When r = 0, the new equation is satisfied. Hence this point is on
the locus we have obtained, but may not be on the locus of (7-22). We
see, however, that when θ = π/2, the point r = 0 results in (7-22).
Therefore, the locus of (7-22) is exactly the circle described here.
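
The conclusion of Example II can be spot-checked numerically. The sketch below assumes the equation r = 2a cos θ and tests whether sampled points satisfy (x − a)² + y² = a². The function name and the sample values of a are illustrative only.

```python
import math

def circle_residual(a, theta):
    """Value of (x - a)^2 + y^2 - a^2 at the point of r = 2a cos(theta)
    with polar angle theta; zero means the point lies on the circle."""
    r = 2 * a * math.cos(theta)
    x, y = r * math.cos(theta), r * math.sin(theta)
    return (x - a) ** 2 + y ** 2 - a ** 2

# Sample theta over a full turn, for one positive and one negative value of a.
thetas = [2 * math.pi * k / 48 for k in range(48)]
worst = max(abs(circle_residual(a, t)) for a in (1.5, -2.0) for t in thetas)
print(worst)
```

The largest residual is at the level of rounding error for both signs of a, in agreement with the radius being |a|.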
Example III. THE 4-LEAFED ROSE

$$r = a\cos 2\theta \quad (a > 0). \tag{7-23}$$


A discussion such as given above shows that the locus of this equation
is as shown in Fig. 7-17. The dashed lines, at θ = ±π/4, ±3π/4, are
the rays at which r = 0. Drawing these in helps to sketch the locus.
Note the order in which the leaves are traced out as θ goes from 0 to 2π:
first, the upper side of the right-hand leaf, then the left side of the bottom
leaf, and so on.





FIGURE 7-17


A technique that is often helpful in making sketches of the loci of polar 
coordinate equations is to make a preliminary sketch of the locus of the 
equation in a rectangular (r, θ)-coordinate system. The sketch of (7-23) in
such a coordinate system would look like Fig. 7-18. 




FIGURE 7-18 


From such a sketch, it is easy to pick out the critical angles which
should be marked on the polar-coordinate plane to help with the sketch.
These are the values of θ for which r = 0 (π/4, 3π/4, 5π/4, 7π/4 in this
case) and values of θ at which r takes on local maximum and minimum
values (0, π/2, π, 3π/2 for the locus discussed here).

The reader is advised to make such a sketch for each of the examples of 
this section and to see how the rectangular coordinate sketch is related to, 
and helps to obtain, the polar coordinate sketch. 
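
For loci of the form r = a cos nθ the critical angles can be listed directly. The following is a minimal sketch under that assumption. The function name `critical_angles` is illustrative, and the closed forms used (zeros at odd multiples of π/2n, extreme values at multiples of π/n) come from the behavior of the cosine rather than from the text.

```python
import math

def critical_angles(n):
    """Angles in [0, 2*pi) at which r = a cos(n*theta) vanishes, and angles
    at which r takes a local maximum or minimum value."""
    zeros = [(2 * k + 1) * math.pi / (2 * n) for k in range(2 * n)]
    extrema = [k * math.pi / n for k in range(2 * n)]
    return zeros, extrema

zeros, extrema = critical_angles(2)   # the 4-leafed rose r = a cos 2*theta
print([round(z / math.pi, 4) for z in zeros])    # as multiples of pi
print([round(e / math.pi, 4) for e in extrema])  # as multiples of pi
```

For n = 2 the lists are π/4, 3π/4, 5π/4, 7π/4 and 0, π/2, π, 3π/2, the angles quoted above.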
Example IV. THE 3-LEAFED ROSE

$$r = a\cos 3\theta \quad (a > 0). \tag{7-24}$$


This equation has a locus as shown in Fig. 7-19. An analysis similar to
that above shows that the curve is traced out twice as θ goes from 0 to 2π.
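
The double tracing can be seen numerically: for r = a cos 3θ the angles θ and θ + π give the same point, whereas for r = a cos 2θ they give opposite points. The sketch below assumes loci of the form r = a cos nθ. The function names are illustrative.

```python
import math

def rose_point(a, n, theta):
    """Cartesian point of the locus r = a cos(n*theta) at polar angle theta."""
    r = a * math.cos(n * theta)
    return (r * math.cos(theta), r * math.sin(theta))

def repeats_after_half_turn(a, n, samples=24, tol=1e-9):
    """True if theta and theta + pi always give the same point of the locus."""
    for k in range(samples):
        theta = math.pi * k / samples
        p = rose_point(a, n, theta)
        q = rose_point(a, n, theta + math.pi)
        if abs(p[0] - q[0]) > tol or abs(p[1] - q[1]) > tol:
            return False
    return True

print(repeats_after_half_turn(1.0, 3))  # 3-leafed rose: traced twice on [0, 2*pi)
print(repeats_after_half_turn(1.0, 2))  # 4-leafed rose: traced once
```

The reason is that cos 3(θ + π) = −cos 3θ, so the sign change in r cancels the half turn in θ.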





Figure 7-19 Figure 7-20 


Example V. THE LEMNISCATE 


$$r^2 = a^2\cos 2\theta. \tag{7-25}$$


The analysis here is similar to that of Example III, except that now the
upper and lower leaves cannot appear since cos 2θ is negative for π/4 <
θ < 3π/4 and 5π/4 < θ < 7π/4. (See Fig. 7-20.)


FIGURE 7-21 


Example VI. THE CARDIOID 
$$r = a(1 - \cos\theta). \tag{7-26}$$


The resulting curve is illustrated in Fig. 7-21. The name, cardioid, 
comes from the heart shape of the figure. The cardioid finds many uses 
as examples in calculus courses. 
Next we turn to the opposite problem: Given a locus, how do we find its 
equation? Let us start with a simple one. 


Example VII. THE STRAIGHT LINE

$$x = a.$$

Using relations (7-21), we see that every point of this line satisfies

$$r\cos\theta = a,$$

or

$$r = a\sec\theta. \tag{7-27}$$


An analysis such as done above can be made to show that (7-27) is indeed 
the equation of this single straight line. 


As our final example (which we do not label, since it will result in a
theorem), let us find the polar form for the equations of the conic sections.
To do this, we use the focus-directrix property of the conic sections. Let
the distance between the focus and the directrix be p, and let the eccentricity
be e. We can assume that the focus is at the origin and let x = −p
be the directrix. The equation of the conic section is then

$$|X| = e\,|XE|, \tag{7-28}$$

where |XE| is the distance between the point X and the directrix.

There are four possible cases which should be considered: where the
point X is to the right or left of the directrix, and where r = |X| or −|X|.
Let us consider only two of these here. The others will be left as exercises.
First, suppose X is to the right of x = −p, and r = |X|. In this case, the
signed distance from X to the y-axis is X · e_1 = |X| cos θ = r cos θ, and
hence |XE| = p + r cos θ. Equation (7-28) then becomes

$$r = e(p + r\cos\theta),$$

which can be solved for r to give

$$r = \frac{pe}{1 - e\cos\theta}. \tag{7-29}$$

Next, suppose that X is to the left of x = −p, and that r = −|X|.
The signed distance of X from the y-axis in this case is |X| cos (θ + π) =
−|X| cos θ = r cos θ. This number is negative. (This case is illustrated
in Fig. 7-22 with the primed coordinates.) Thus

$$|XE| = -r\cos\theta - p,$$
and Eq. (7-28) becomes
$$-r = e(-r\cos\theta - p).$$

Solving for r again results in (7-29).
FIGURE 7-22 


Without bothering with the other cases, let us turn to Eq. (7-29) and 
see what the actual locus of it will be. The above discussion shows that 
every point on the locus of (7-29) will be on the conic, but the question is, 
will the entire conic be covered? 

In the case of an ellipse 0 < e < 1, and it is easily seen that all the
points on the ellipse are obtained. For each θ, (7-29) gives a positive value
for r, which is always less than p when π/2 < θ < 3π/2.

For a parabola, e = 1 and r is undefined in (7-29) if θ = 0. All other
values of θ between 0 and 2π give positive values of r. Again when π/2 <
θ < 3π/2, r is less than p.

In the case of a hyperbola, e > 1, and r is undefined when cos θ = 1/e.
Suppose cos α = 1/e where α is between 0 and π/2. Then for −α <
θ < α, we see from (7-29) that r is negative. Indeed we can verify that
the resulting point is to the left of the directrix. As θ varies within this
range, one entire branch of the hyperbola is swept out. When α < θ <
2π − α, we see that r is positive in (7-29) and the remaining branch of the
hyperbola is produced.

A sketch of the hyperbola that results is shown in Fig. 7-23. Note that 
the lines determined by θ = ±α and θ = ±α + π are not the asymptotes
of the hyperbola. They are parallel to the asymptotes, however, and are 
shown dashed whereas the asymptotes are represented by solid lines in 
Fig. 7-23. 


FIGURE 7-23 
Putting together the above observations, we find that we have proved
the following theorem.


Theorem 7-4. The equation in polar coordinates

$$r = \frac{pe}{1 - e\cos\theta}$$


has as its locus a conic section with eccentricity e, a focus at the origin, 
and associated directrix x = —p. 
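
The theorem can be spot-checked numerically. The sketch below assumes the polar equation r = pe/(1 − e cos θ) with focus at the origin and directrix x = −p, and measures how far a computed point is from satisfying the focus-directrix condition |X| = e|XE|. The function names and the sample values of p, e, and θ are illustrative only, and the sample angles are chosen to avoid the values at which r is undefined.

```python
import math

def conic_r(p, e, theta):
    """r from the polar equation; undefined where e*cos(theta) = 1."""
    return p * e / (1.0 - e * math.cos(theta))

def focus_directrix_gap(p, e, theta):
    """|X| - e * (distance from X to the line x = -p); zero on the conic."""
    r = conic_r(p, e, theta)
    x, y = r * math.cos(theta), r * math.sin(theta)
    return math.hypot(x, y) - e * abs(x + p)

p = 1.0
thetas = [0.3 + 0.4 * k for k in range(15)]   # avoids theta = 0 and cos(theta) = 1/2
worst = max(abs(focus_directrix_gap(p, e, t))
            for e in (0.5, 1.0, 2.0) for t in thetas)
print(worst)

# Hyperbola with e = 2: for -alpha < theta < alpha (alpha = pi/3), r is negative
# and the point lies to the left of the directrix.
r_neg = conic_r(p, 2.0, 0.3)
x_neg = r_neg * math.cos(0.3)
print(r_neg, x_neg)
```

The gap stays at the level of rounding error for the ellipse, the parabola, and both branches of the hyperbola, including the branch obtained from negative values of r.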


PROBLEMS 


1. If a locus is defined by the equation r = f(θ) and if f(θ) is periodic with
period 2π, show that the locus has at most two points other than the origin
in common with any line through the origin.


2. What are the polar coordinates of the origin? 


3. If relations (7-21) are used to transform an equation in cartesian coordinates 
to one in polar coordinates, will any points of the locus be lost? 


4. Sketch the Spiral of Archimedes,

$$r = a\theta, \quad a > 0.$$

What happens if a is negative? Sketch this case also.


5. Find the equation in polar coordinates of the circle

$$(x - a)^2 + (y - b)^2 = R^2.$$

Can you solve for r as a function of θ in the resulting equation? What happens
if R² = a² + b²?


6. If S is the locus of r = f(θ) and if the plane is rotated about the origin
through an angle α, leaving the coordinate system fixed, so that the point
set S becomes the point set S′, what is the equation of S′?


7. Sketch the loci of the following:

(a) r = 2a sin θ, a > 0          (b) r = a sin 2θ, a > 0
(c) r = a(1 + cos θ), a > 0      (d) r = a(1 − sin θ), a > 0
(e) r = a csc θ, a > 0


8. Discuss the locus of

$$r = a\cos n\theta$$

(a) if n is an even integer; (b) if n is an odd integer.


9. What is the equation in polar coordinates of the line

$$x\cos\alpha + y\sin\alpha = p?$$


10. A locus is said to be symmetric with respect to the origin if whenever X
is on the locus then Y is also on the locus where Y = −X. Prove that if
f(θ) is periodic with period π, then the locus of r = f(θ) is symmetric with
respect to the origin.


11. Prove that if f(π − θ) = f(θ) for all θ, then the locus of r = f(θ) is symmetric
with respect to the y-axis.


12. Prove that if r is given as a function of θ which involves only sin θ, then the
resulting locus is symmetric with respect to the y-axis.


13. Prove that if f(−θ) = f(θ) for all θ, then the locus of r = f(θ) is symmetric
with respect to the x-axis.


14. Prove that if r is given as a function of θ which involves only cos θ, then the
resulting locus is symmetric with respect to the x-axis.


15. If f(−θ) = −f(θ) for all θ, what are the symmetry properties of the locus
of r = f(θ)? Note that f(0) = 0 because of this condition.


16. Let a function f(θ) be given and suppose that another function g(θ) is such
that g(θ) = −f(θ + π). Prove that the polar-coordinate equations

$$r = f(\theta)$$

and

$$r = g(\theta)$$

have the same loci.


17. If an equation of a locus in polar coordinates is

$$r = f(\cos\theta),$$

what is the locus of the equation

$$r = -f(-\cos\theta)?$$


18. Use Eq. (7-28) to obtain r as a function of θ for the two cases not discussed
in the text. How is the locus of the resulting equation related to the locus
of (7-29)?


19. Let e > 1, cos α = 1/e, with 0 < α < π/2. Prove that if −α < θ < α,
then the point with polar coordinates (r, θ) defined by (7-29) is to the left of
the line x = −p.


20. What are the loci of the following for e < 1? e = 1? and e > 1?

(a) r = pe/(1 + e cos θ)    (b) r = pe/(1 − e sin θ)    (c) r = pe/(1 + e sin θ)
21. Identify and sketch the loci of 


6 4 
BS oo cob (b) r = eae 

9 4 
eS an (d) t = § sand 


22. Sketch the limaçon

$$r = a + b\cos\theta$$

(a) if 0 < a < b;

(b) if 0 < b < a.


23. Prove that the lemniscate (7-25) is the locus of all points with the property
that the product of their distances from the points (a/√2, 0) and (−a/√2, 0)
is a²/2. [Hint: Use the x-, y-coordinates of the point which are given by the
polar coordinates of the point.]


Answers to Selected Problems 


Section 1-1 
1A=DCCCB=E; A 
2. (a) {x|z2 < 2} (c) {ala 

(e) {x|z > O and z? < 2} 
3. (a) {x|0 <2 <1} (b) O 


DCF 
2k, k an integer} 


4. Yes 


Section 1-2 


1. Axiom 8 fails 6. All hold. 
13. (a) Commutative and associative (c) The number 0 is an identity (on 


the right). There are right and left hand inverses. a ° (—a/2) = 0, 
(—2a) -a = 0. 
14. (b) closed under *, A, and o. 


Section 1-4 
2. (a) m is the maximum of A if and only if (t) m is the least upper bound 


of a, and (ii) m isin A. 
3. (b) y2 — 2 = (z? — 2)?/4r? 


Section 1-5 
4. If and only if a and b are both nonnegative or both nonpositive. That 


is, if and only if ab > 0. 
(e) =l i} 


7. (a) —5,9 (c) —2, 3 
8. (a) x > 8andz < 2 (c) z > landz < —3 


Section 1-6 
1. (a) —289 (c) 0 2. (a) —15 (c) 0 
3. (a) —4, —15 (c) 4, —3, 3 


2 5 2 
6 al 





7. (a) p 2 
3|, _5| +6 
E s| i 5| — ; 
He aitse a 


9. 5 
Section 2-1 


2. Symmetrically located with respect to: (2) the y-axis, (ii) the z-axis, 
(iii) the origin. 
3. (a) 4V2 (e) 2V53 (e) 14 
5. (a) (z — 1)? + (y — 2) = 16, z? + y? — 2r — 4y — 9 
(c) (z +5)? + (y — 3)? = 1, z? + y? + 10x — 6y + 33 
6. (a) z? + (y — 2)? = 4, z? +y? — 4y =0 
(c) (t + 2)? + (y + 2)? = 64, r? + y?+ 42+ 4y — 56 = 0 
. (a) (—1, 2),3 (c) No locus 
. (@) (—1, 3),2 (c) (—2, —3), $ 
. The single point (zo, yo) 
11. (a) z? + y? — Te + 1y = 0, ($, —3}), V170/2 
13. (x — 3)? + (y + 7)? = 90 


Section 2-2 


0 
0 


co on 


2, z — ee —cz=0. As m becomes large, the equation tends toward 
m 
zr—c= 0. 
4. (a) y = 2x — 11, 2r — y — 11 = 0 
(c) y = 54+ 10, 54 —y+10=0 


(e) y = —8r¢+ 43, 32+ 5y — 13 = 0 
5. (a) y = —3x — 48, c+ 2y4+ 13 = 0 
(c) y = —z5r + p 1 H 50y —1=0 
6. (a) m = 2, 2r — y — 5 = 0 (ec) noslope, x — 4 = 0 
(e) m=0y—5=0 
7. (c) m = 1,xz — y — 12 = 0 (e) m = —94, 942 + y — 690 = 0 


Section 2-3 


1. The line cannot be parallel to the y-axis. If y = mz + b is the equation 
of the line, then f(z) = mz + b. 

2. f(z) = [R? — x?]!/2, domain is {x| —R < xz < R}, image is 
{y l0 < y < R}. 


Section 2-4 


2. (a) Yes, (z, y) > (1 +h+ kR, y +k+k) (b) Yes 
(c) Yes, if T is given by (z, y) —> (x + h, y + k), then T’ is given by 
(z, y) = (x 745 h, y aa k). 
4. {(z, y)| |z — 2| + ly + 4| = 1} 


Section 2-5 


2. (a) w/2 (c) 3r/2 (e) 3r/2 (g) w/2 
(i) m/4 (k) rt/3 (m) 77/6 (0) 

3. (a) 360 a/2r 

4. (a) 90° (c) 270° (e) —90° (g) —270° 
(i) 45° (k) 60° (m) 675° (0) 22860° 

7. a+ B = Y+ some integral multiple of 2r 




Section 2-6 
1. (a) æ = kr, k any integer (c) a = 2kr + 7/2 (e) a = 2k — w/2 
3. (a) —sin (27/5) (c) sinO (e) cosO (g) —cot (1/6) 
5. (a) —sin 23° (c) sin 5° (e) —tan 5° (g) —sec 35° 


Section 2-7 
1. (a) c = 7/2 (c) b = 5cota (e) a = ccosB 


a b c COS @ cos B cos Y 

3. (a) 30 20 —76 

(o) Bo R -H 
4. (a) 9 3 15 

© 5 % z 
5. (a) Impossible 

(c) 3 3 3 

(e1) 5 =E 3 

(2) 3 OH 
6. (a) 4 2 I5 


(c) Impossible 


Section 3-1 
1. (a) 7 (c) v94 2. The only point is (1, 1, 2) 
3. (a) x? + y2? + 22 — 6r — 2y + 4z — 11 = 0 
4. It is a sphere if B? + C2? + D? > 4AE. The center is at (—B/24A, 
—C/2A, —D/2A). The radius is [(B2 + C? + D? — 4AE)/4A?]!/2. 


5. (a) (3, =l —2), 3 (c) (3, = 2), V 301/6 
6. z? + y2? + z2 — 6r + 2z — 15 = 0, center (3, 0, —1), radius 5 


Section 3-2 


1. (a) (7,4,3), (© (—V3/9, —v3/9, 5V3/9) 

2. (a) (—5, 4, —4), 57, (—5/V/57, 4/\/57, —4/V57) 
(c) (1, -1, 0), V2, (1/\/2, —1/v/2, 0) 

6. (a) (3, 3, 5), (3, 4 3) 45), (= 4343 43 
(c) (3,3 9) 1), (4, $, 1), (2, 3 3) 1) 


Section 3-3 


= (a; + 61, a2 + be, a3 + b3) 
2. 100 cos 8, where £ is the angle between the string and the vertical. 


Section 3-4 
4. (a) 7,3, —4 (c) 0,0,0 
5. 5,0, —2 Ke) Oto AAR klp ml 
8. (a) [}, —$, 3) (œ 4, 0, -5 9. (b) All but P8 
Section 3-5 
2. (a) 4$, #8 48 , 2a (a) [43%, £2 2 —41486]) 


3. (a) (3. 28 sl (c) (0, 0 0, 0) (e) ($3, —3$, $3] 
4. aj@1, 42€2, a3e3 5. (a) —3 (c) ot 
7. [—2, 1, 0] 


Section 3-6 


(Hea) < (Be) (Ee) 


i=l 


7. This reduces to the triangle inequality for scalars. 


Section 4-1 

1. (a) c+ 2y+72+7 = 0 (c) 2 = 0 

3. (a) 5z + y + z — 10 = 0, (0, 0, 10) 
(c) a+ y+ 2z = 0, (0, 0, 0) 

4. (a) (0, 0, —5), [8, —1, 1] (c) (0, 0, 0), [2, 1, 4] 

5. 2r — y+ z — 4 = 0. One of the three quantities a, b, or c must be 
nonzero. If a =Æ 0, say, then there are really only three unknowns: 
b/a, c/a, and d/a. 


Section 4—2 
1. (a) [—5, 8, —19] (c) [—4, —8, —4] 


5.    ×  |  e_1    e_2    e_3
    -----+--------------------
     e_1 |   0     e_3   −e_2
     e_2 | −e_3     0     e_1
     e_3 |  e_2   −e_1     0
7. (a) 83t t+y—2+4=0 () t—2z4+1 = 
9. (2, —7, —1] 


Section 4-3 
1. (a) 22//83 = (c) 1 


2. If the sign is positive, the vector from the plane to the point is parallel 
to the given normal vector. If the sign is negative, this same vector is 
collinear with, but not parallel to, the given normal. 


4. (a) (1 — 88,3 — 28414 d (c) [—2, 1, —4], 
V 83 Vv 83 V/ 83 


o> 


. (a) —~e, (ce) —4 (e) 0 
9. Plane: 2x + 2y — z+ 5 = 0, distance 6 
Section 4-4 
1. (a) X = [1, 2, 7] + a4, 1, 6] 
(c) X = [11, 12, 13] + t[9, 11, 14] 
(e) X = (0, 1, 2] + ¢0, 0, 1] 
2. (a) X = [1, —1, 0] + ¢{[—1, 5, 13] 
(c) X = (1, l, —1} 


3. The value of ¢ n on the particular equation, but the point is: 


(1c) (0, —48 32) de, 30 "—28), (34, 25, 0), (le) ~y rs) (0, l, 0) 
(2a) (0, 4, 13), g 0, 73%), (1, =p 0), a iO 0, 0) 
(2e) (0, ast — 839), ($8, 0, ze 31), (23 > 0) 


7. (a) t = —3, (2 9, —11) (c) t = —2, (3, 8, —15) 
9. X = [1, —1, 2]+ #3, —2, 1] 

11. (a) (5, 1, —10) (c) (7, 0, —3) 

12. (a) X = [5, 7, —10] + t[16, 31, 9] 


13. (a) V/143/7 (e) v61/v65 








Section 5-1 
4. (a) [—16, —82, 36] 6. (a) V 899 (c) V 276 
7. 2 8. (a) V 66 
12.
$$V = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$
14. X = U × B where U is any vector such that A · U = −1, hence the
complete solution is X = B × A/|A|² + cA, for any c.
16. (a) (A · B)A − (A · A)B (d) (A · A)²(A × B)
(e) (A · A)²(A · B)A − (A · A)³B


Section 5-2 


D-BXC _A-DXC _A-BXD 
A-Bxc’ 4” A-BXC’” A-BXC 


A solution exists if D = 0, or if D # 0 and A, B, and C are not coplanar 
(are linearly independent). 

11. (a) C = 4A — B (c) C = —5A + 6B 

12. (a) D = 2A + 4B — C (c) D = 10A + 2B — 2C 


a= 


Section 5-3 
1 1 
1. —— |, 3,0). =—|—3, £0), 0, 0, 1 
@) F130, Fl , [0,0, 1] 
Oph en ea ee 


v14 V'826 v59 
2. B = a_1 u_1 + a_2 u_2 + a_3 u_3, where a_1, a_2, and a_3 are: 























4v 10 2V 10 56 22 
a) ~—, “2 () Qe, 
826. 59 
1 3 —3 1 
6. (a) 2’ = z+ —y, y’ t + — y, =z 
V 10 vV 10 ~ 4/10 V 10 
2 1 3 24. 9 13 
(c) 2’ = I =—Sy = zy = Tem y a z 
Ji4 va va 826 826 826 
fa ope a, Sst 
59 vV 59 vV 59 
7. (a) a! = By + $ = +8 — pay + ree, 


z’ 23x +4 ay = 882 
A 5 {u2 = ggu. 
9. (a) (—33, —2#2, —#] () Ws, —®&, sl 
10. (a) [1, —1, 0], (0, 1, —1], (0,0, 1) (c) b3 1: — 1l, fo, —& —), 


(—7, ER 1] 
Section 5—4 


1. (a) [1,2,0] (c) =[162, —105, 211] 
2. (a) 35[3, —1, 10] (c) 14154, 6, 38] 


4. (a) Pı: V11/2, P3: V 1782/83 (b) L2: 20/59, L4: 2/ V6 


Section 6-1 
1. (a) 82? — y — 1)? -—(— 1)? =0 ©) y? ara a = 
2. (a) 3r? — y2? + 2y — 2 = 0, hyperbola (©) ; + y? = 0, parabola 
5. (a) (bf — N?)x? + Qbibexy + (b3 — d?)y? 
(c) B = (0, V2/2, V2/2],\ = V2/2, for is give x? = 0. 


GO 
@ Il 1 


Section 6-3 
1. (a) b? = a?(1 — e?), c = ae, d = a/e 
(c) b? = a? — ¢?, d = a?/c, e = c/a 
(e) a? = b? + c?,d = (b? + c?)/c, e = c/[b? + c?]}/? 


(g) a = c/e, b? = c2(1 — e?)/e?,d = c/e? 
(i) a = de, b? = d2e?(1 — e?), c = de? 
2. (a) det — de? + b? = 0 
4. (a) F = (0, 3), (0, —3), PV = (0, 5), (0, —5), CV = (4, 0), 
(— 4, 0), e = 2,dir:y = 25 y = —*2. 
(c) F = (\/65, 0), (—V/65, 0), PV = (9, 0), (—9, 0), CV = (0, 4), 
(0, —4), e = 0/65/9, dir: £ = 81/\/65, x = —81/\/65. 
(e) F = (+1,0), PV = (+V5, 0),CV = (0, +2),¢ = 1/V5, 
dir:z = +5 
(g) F = (+V6,0), PV = (+V/10, 0), CV = (0, +2), e = V/15/5, 
dir: z = +5v6/3 
2 2 
5. (a) > + E =i (c) 2427 -+ 25y? = 216 
(e) 24r? + 40y2 = 960 (g) 9r? + 25y? = 144 
6. (a) F = (+1, 0), PV = (3,0), CV = (0, +2V2), e = 4, dir: 
x= +9 
(c) F = (+3,0), PV = (+3,0),CV = (0, +6V6/5), e = 4, dir: 
x = +15 
(e) F = (+4,0), PV = (+7Vv40, 0), CV = (0, +V/24), e = V/10/5, 
dir: z= +10 


(g) F = (++£, 0), PV = (+4,0),CV = (0, +4), e = , dir: 
= +5 


Section 6—4 


2. (a) pd = 8, cd = 6, V = (+8, 0), F = (+10, 0), e = 3, dir: 2 = 
+32 asym: y = +32/4 
(c) pd = 3, cd = 2, V = (0, +3), F = (0, +v 13), e = vV 13/3, 


dir: y = +9/V 13, asym.: y = +32/2 

(e) pd = 3, cd = 4, V = (+3, 0), F = (+5, 0), e = š, dir:x = +2, 
asym: y = +$ 

(g) pd = 6, cd = 2, V = (+6, 0), F = (+2v10, 0), e = vV 10/3, 
dir:z = +18/V/10, y = +22/6 


2 J 2 2 
x = A 25y = x yo 
4. The equation is 
2 y? uth 
a bo à 


The two hyperbolas have the same asymptotes. 
9. Any pair determine a unique hyperbola: 
(a, b) c? = a? + b?, d? = at/(a? + b?), e? = (a? + b?)/a? 
(a, d) b? = a?(a? — d?)/d?,c = a*/d,e = a/d 
(b, e) a? = b?/(e? — 1), c? = b?e?/(e? — 1), d? = b?/e?(e? — 1) 
If b and d are given, then e must satisfy the equation 


et — e? — b?/d = 0. 


This equation is quadratic in e?. One of the two possible solutions is 
negative. Hence there is only one e which can satisfy this equation. 


Section 6-5 


I; axis directrix focus 
(a) t= 0, y= =4; (0, 4) 
(c) x = 0 y= 3, (0, —4) 
(e) t= 0, y = E, (0, — 5) 
(g) r=0 y = —ys; (0, zs) 




2. (a) y2 = 8x (c) y2? = 2x (e) z? = —4y/3 (g) y2? = 2z 
(i) 2? = —12y/5 

4. (a) 9z? — 400r + 16y? — 300y — 24ry = 0 
(c) z? — 4r + y? — 12y — 2zry + 20 = 0 


Section 6—6 
1. (a) hyperbola, C= (1, 3), F = (1, 8), (1, —2), y = (1, 1), (1, —1), 
pr axis: x = 1, conj axis: y = 3, dir: y = 34, y = —Z, asym: 4% — 


3y + 5 = 0, 42+, 3y — 18 = 0, e = § 
(c) ellipse, C = (—1, 7), F = (—1, 11), (—1, 3), PV = (—1, 12), (—1, 


2), CV = 53 7), — —4, 1), pr axis: x = —1, conj axis: y = 7, 
dir: y = 23,y = ge =i 

(e) ellipse, C = = (3, —1), F = (6, —1), (—2, —1), PV = (8, —1), 
(—2, —1),CV = (3,3), (3, =) pr axis: y = —1, conj axis: z = 3, 
dir: T= 34 r =—lb e= 

(g} parabola, V = (4, 2), F = @, 2), axis: y = 2, dir: z = —1 

(i) circle, center (3, —4), radius $ 

2 2 
3. (a) S29 4G a O eT? aud), 
o ei L ae a 
1/4 ` 


4. y² = −(1 − e²)x² + 2(1 + e)x; as e → 0, the equation becomes (x − 1)² +
y² = 1, which is the equation of a circle of radius 1 with center (1, 0); as
e → 1, the equation becomes y² = 4x, which is the equation of a parabola.
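The two limiting cases in Problem 4 can be confirmed by direct substitution. The sketch below is an illustration only, with hypothetical function names. It evaluates the right-hand side of the equation at e = 0 and e = 1 and compares it with the circle and the parabola named in the answer.

```python
def y_squared(e: float, x: float) -> float:
    """Right-hand side of y^2 = -(1 - e^2) x^2 + 2 (1 + e) x."""
    return -(1.0 - e**2) * x**2 + 2.0 * (1.0 + e) * x


def circle_y_squared(x: float) -> float:
    """y^2 on the circle (x - 1)^2 + y^2 = 1."""
    return 1.0 - (x - 1.0) ** 2


def parabola_y_squared(x: float) -> float:
    """y^2 on the parabola y^2 = 4x."""
    return 4.0 * x


def limits_agree(samples=(0.0, 0.5, 1.0, 1.5, 2.0), tol=1e-12) -> bool:
    """True if e = 0 gives the circle and e = 1 gives the parabola."""
    return all(
        abs(y_squared(0.0, x) - circle_y_squared(x)) < tol
        and abs(y_squared(1.0, x) - parabola_y_squared(x)) < tol
        for x in samples
    )
```

At e = 0 the expression reduces to −x² + 2x, and completing the square gives 1 − (x − 1)², which is why the circle has center (1, 0).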


Section 7-1 
2. x = x' cos θ − y' sin θ, y = x' sin θ + y' cos θ


ab? — (b aaar 2 2 
4. y = O    5. 160x² − 24xy + 153y² = 24,336.
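The rotation formulas of Problem 2 in this section can be sketched in a few lines. This is an illustrative example, not the book's own material: the function names are hypothetical, and the inverse is obtained by solving the same pair of equations for x' and y'.

```python
import math


def from_rotated_axes(x_new: float, y_new: float, theta: float):
    """Old coordinates (x, y) of the point with rotated coordinates (x', y')."""
    x = x_new * math.cos(theta) - y_new * math.sin(theta)
    y = x_new * math.sin(theta) + y_new * math.cos(theta)
    return x, y


def to_rotated_axes(x: float, y: float, theta: float):
    """Rotated coordinates (x', y') of the point with old coordinates (x, y)."""
    x_new = x * math.cos(theta) + y * math.sin(theta)
    y_new = -x * math.sin(theta) + y * math.cos(theta)
    return x_new, y_new
```

Applying one function and then the other with the same θ returns the starting pair, up to rounding.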


Section 7-2 
3. (a) parabola, sin 0 = $, cos 0 = 3, 


25x? + 28r’ — 104y + 100 = 0 
(c) hyperbola, θ = π/6,
2x'² − 2y'² = −1
4. Maximum when φ = θ, minimum when φ = θ + π/2


Section 7-3 


2. The planes are made up of the asymptotes of the hyperbolas. 
4. (a) elliptic cone (c) ellipsoid (e) hyperbolic paraboloid 
(g) parabolic cylinder 
5. (a) hyperboloid of one sheet (c) two planes (e) circular cone
(g) prolate spheroid 
6. (a) A hyperboloid of two sheets centered at (—3, 0, 1). Vertices at 
(—3, 1, 1) and (—3, —1, 1). 
(c) An ellipsoid with center at (2, —2, 2). 
(e) Hyperboloid of two sheets with center at the origin and vertices at 
(0, 4, 0) and (0, —4, 0). 
(g) A hyperbolic cylinder with axis x = —3, z = 0. 


Section 7-4 


2. r = 0 and any θ    6. r = f(θ − α).
9. r = p sec (θ − α)    17. It is the same
21. (a) Ellipse (b) parabola (c) hyperbola (d) ellipse 


Index 


absolute value, 20 

addition, 6 

addition formulas, 76 
addition law, 12, 14 
alternative order axioms, 14 
angle, 56-62, 136 
anticommutative, 128, 129 
arc length, 58 

associative law, 7, 128, 101 
asymptote, 51, 204, 205 
axiom, 2, 6 

axis, 179, 196, 197, 204, 210, 211 
axis of rotation, 165 (Prob. 8) 


bilinearity, 112, 129 
binary operation, 6 
binary relation, 6 
bounded, 18 

bound vector, 96 


cancellation law, 112 

cardioid, 245 

cartesian coordinates, 80 

cartesian plane, 33 

Cauchy-Schwarz inequality, 115 

center, 84, 196, 197, 204, 241 (Prob. 9)

central quadric surface, 241 (Prob. 9) 

circle, 35, 181, 184, 243 

closure, 7, 8 

cofactor, 29 

collinear, 100, 131, 152 

commutative law, 7, 101 

completeness, 16 

completeness axiom, 18 

complex numbers, 12 

component, 95 

cone, 179, 237, 238 

conic section, 179-185, 248 

conjugate axis, 196, 197, 204 

conjugate dimension, 197, 204, 205 


conjugate hyperbolas, 208 (Prob. 4) 
conjugate vertices, 197 
coordinate, 18, 34 

coordinate axis, 18, 33 
coordinate line, 18, 33 
coordinate system, 162 
coordinate vectors, 108, 159-164 
coplanar, 153 

cosecant, 69 

cosine, 63, 110 

cotangent, 69 

cross product, 126-132 

cylinder, 233, 238 


Dandelin, G. P., 186 
Dandelin sphere, 187 
decimal expansion, 17 
degree, 58, 62, 63 (Prob. 5) 
determinant, 25-31 
difference, 98 

directed line segment, 90, 96 
direction cosines, 89, 90 
direction numbers, 88, 90 
directrix, 188, 192, 202, 210 
distance, 19, 35, 83 

distance formulas, 133-136, 169-172 
distributive law, 7, 129 
domain, 46 

dot product, 110

dual basis, 165 (Prob. 10) 
Dürer, Albrecht, 196 


eccentricity, 188 

element, 2 

ellipse, 181, 184, 186, 191-199, 228 
ellipsoid, 236 

elliptic cone, 236 

elliptic cylinder, 234 

elliptic paraboloid, 234 

empty set, 4 

equality, 6 
equivalent, 36
euclidean space, 104 
extended Lagrange Identity, 148 


field, 6-11 

field axioms, 7 

focal dimension, 197, 204 
focus, 186, 192, 202, 210 
formal proof, 10 
formula, 46 

free vector, 96 

function, 46 


generator, 174, 180 

geometric angle, 57 

Gram-Schmidt orthogonalization, 178 
graph, 48 

group, 101 


half angle, 179 

Hilbert, D., 88 

Hilbert space, 112 

homogeneity property, 128 

hyperbola, 183, 184, 187, 200-207, 228 
hyperbolic cylinder, 235 

hyperbolic paraboloid, 235 
hyperboloid of one sheet, 237 
hyperboloid of two sheets, 236 


identities, 145-150 
identity, 7, 101 

image, 46 

inclination, 56 

initial side, 57 

inner product, 110 

inner product space, 112 
intercept, 41 

inverse, 7, 101 

irrational numbers, 15, 17 


Lagrange’s identity, 148 

law of cosines, 75, 114 (Prob. 11) 
law of sines, 73, 151 (Prob. 17) 
least upper bound, 18 
lemniscate, 248, 250 (Prob. 23) 
length, 90 

limaçon, 250 (Prob. 22) 


line, 39, 92, 138, 246 

linear dependence and independence, 156, 173

linearity property, 107 

linear space, 102 

locus, 36 


magnitude, 95 

major axis, 196 

mapping, 48 
mathematical model, 1, 82 
maximum, 19 (Prob. 2) 
measure, 59, 60 
measuring line, 16 

method of replacement, 175 
midpoint, 91 

mil, 58, 63 (Prob. 6) 
minor, 29 

minor axis, 196 
multiplication, 6
multiplication law, 12, 14 


nappe, 179 

nondegenerate conic section, 184 
normal form, 137 (Prob. 5) 

null set, 4 

number line, 15 


oblate spheroid, 238 
order, 6, 12 

order axioms, 12, 14 
ordered pair, 46 
ordered triple, 82 
order relation, 12 
orientation, 57 
oriented geometric angle, 57 
origin, 18, 34, 82 
orthogonal, 110, 121 
orthogonalization, 178 
orthogonal set, 160 
orthonormal set, 160 


parabola, 183, 184, 210-212, 228 
parabolic cylinder, 233 
paraboloid, 238 

parallel, 100, 121 

parallelogram of forces, 94 


parallel translation, 84 

parametric equations, 139 

periodic functions, 63, 64, 249 (Prob. 10)

perpendicular, 110 

plane, 120-124 

point, 82, 86 

polar coordinates, 241-248 

polynomial, 49 

positive definite, 112 

principal axis, 196, 197, 204 

principal dimension, 197, 204 

principal vertices, 197 

product, 6 

projection, 72, 106-113, 166-172, 173 (Prob. 6)

prolate spheroid, 238 

Pythagoreans, 15 

Pythagorean theorem, 34, 83, 168 


quadratic equations, 213-218, 224-230, 231
quadric surfaces, 231-240 


radian, 58, 63 (Prob. 5) 

radius, 35, 84 

range, 46 

rational function, 49 

rational numbers, 10 (Prob. 2), 15, 17 
ray, 57, 88 

real numbers, 5 

reflection, 52 

right circular cone, 179 
right-handed coordinate system, 82 
right-handed system of vectors, 164 
right-hand rule, 128 

right triangle, 71 

rigid motion, 52 

rose, 244, 245 

rotation, 52, 82, 162, 220-223 

rule, 46 


scalar, 93, 97 

scalar multiple, 93, 97 

scalar product, 110 

scalar triple product, 129, 150 
secant, 69

set, 2-4

sine, 63 

slope, 41 

solution set, 36 

sphere, 84 

spheroid, 238 

spiral of Archimedes, 248 (Prob. 4) 

straight line, 39, 92, 138, 246 

subset, 3 

sum, 6, 94, 98 

summation symbol, 109 

surface of revolution, 237 

symmetric equations, 139, 143 (Prob. 6)

symmetric property, 112 

symmetry, 195, 249 (Prob. 10) 


tangent, 69 

terminal side, 57 

three-dimensional space, 80, 82 

transitive law, 12 

translation, 52, 53, 84, 85, 213-215 

transpose, 26 

triangle, 71

triangle inequality, 22, 114-118 

trichotomy law, 12, 14 

trigonometric addition formulas, 76, 222

trigonometric functions, 63-69 

triple cross product, 145 

triple of numbers, 96 

truth set, 36 


uniqueness, 7 
upper bound, 18 


value, 47, 59, 60 

vector, 93, 95 

vector addition, 94, 98 

vector identities, 145-150 
vector space, 96, 102, 104 
vertex, 179, 197, 204, 210, 211 
void set, 4 


zero vector, 95 